PREFACE


INTRODUCTION 


Calculus is a branch of mathematics that, broadly speaking, introduces concepts and tools to describe and 
analyze functions. Although some parts of calculus were known to the ancient Greeks, Egyptians, and Chinese, 
the modern version of calculus that we use today was largely developed in the 17th century, independently by
the great mathematicians Isaac Newton and Gottfried Leibniz. Calculus is not only an important branch of 
mathematics in its own right, but also provides the rigorous mathematical foundation of physics, engineering, 
and many other branches of science. 


Unfortunately, most students first learn calculus as a bag of tricks: a number of seemingly unrelated algorithms 
to be memorized and then endlessly applied to problem after problem without motivation. Calculus is sometimes 
seen as the pinnacle of high-school mathematics, and passing a college placement exam in calculus is the ultimate 
goal. Students who learn calculus this way—and this describes most high-school and college students who take 
calculus—will likely never appreciate the beauty or richness of the subject. 


Our goal for this book is to present calculus with a substantial theoretical underpinning. Calculus, at its 
heart, is a few fundamental ideas that come together to create a rich subject, and is not a collection of definitions, 
formulas, and algorithms. Our hope is that students who complete this book will understand calculus, both as a 
theoretical subject and as a problem-solving tool. 


WHO SHOULD STUDY CALCULUS USING THIS BOOK 


The target audience for this book is motivated high school students who have mastered the high school 
curriculum and have developed the mathematical maturity necessary to handle the level of mathematical rigor 
in this text. At a minimum, students must have mastered algebra, plane geometry, and trigonometry before 
considering continuing on to calculus. This is true for any course of calculus study, and especially important for 
the more rigorous calculus treatment in this book. We strongly recommend a robust precalculus curriculum, such 
as that in [Ru], before proceeding with this text. (Note: bold letters in brackets such as [Ru] refer to the References 
on page 303.) 


Students—even highly skilled students—who rush into calculus too soon are likely to be frustrated and will 
not be able to appreciate the richness and subtleties of the subject. Such a student will only learn calculus as a set 
of algorithms to be memorized, and even though this may be sufficient to progress on to the next subject, it will 
rob the student of a key step in his or her mathematical development.


We strongly recommend that students, especially younger students, be exposed to both nontrivial problem 
solving and to discrete mathematics (such as combinatorics and number theory) before “continuing on” to calculus. 
Art of Problem Solving has other textbooks and online courses in both of these areas. 


This book in particular is designed for students who want a deeper understanding of calculus than a mainstream 
high-school or college calculus text provides, and who also want exposure to a variety of non-routine calculus 
problems. Specifically, this book differs from a “mainstream” calculus book in two major ways: 


1. A more rigorous presentation, including proofs where applicable. For example, this book begins with set 
theory and the construction of the real numbers, including a discussion of suprema, infima, and completeness.
We cover the rigorous δ-ε definition of limit, which is omitted from many calculus texts. We also
prove many of the important results of calculus, including the Mean Value Theorem and the Fundamental 
Theorem of Calculus; these results are often merely asserted without proof in standard calculus treatments. 


2. An assortment of nontrivial problems. This book has fewer routine “drill-and-kill” exercises than most 
calculus texts, and instead has a wider array of problems that require the students to go beyond rote 
memorization of algorithms and instead to think more deeply about how the different aspects of calculus 
are interrelated. We have taken many nontrivial problems from two of the premier math contests that 
include calculus: the high-school level Harvard-MIT Mathematics Tournament (HMMT) and the college- 
level William Lowell Putnam Mathematical Competition. The HMMT is an annual math tournament 
for high school students, held at MIT and at Harvard in alternating years. It is run exclusively by MIT 
and Harvard students, most of whom themselves participated in math contests in high school. More 
information is available at web.mit.edu/hmmt. The Putnam Competition is a long-running North American
undergraduate math competition held every December (2014 will be the 75th annual contest), which consists
of extremely difficult problems across the entire undergraduate mathematics curriculum. More information
is at math.scu.edu/putnam.


Students who are preparing for college calculus placement examinations may wish to supplement their study 
with a test-preparation workbook. The key to success on such placement examinations is repetition of routine 
calculations, which this textbook largely eschews in favor of a variety of more difficult, non-routine problems. 


STRUCTURE OF THE BOOK 


Chapter 1 is a review of important foundational material: sets, real numbers, functions, graphs, trigonometry, 
exponentials, and logarithms. We urge students not to skip this chapter, even if they think that they already know 
this material. We present the material at a level of detail and rigor that students may not be used to, and will be 
introducing terminology and notation that is not typically covered in a precalculus class. 


Chapter 2 introduces the first core idea of calculus: the limit. We present the full δ-ε treatment of limits, which
is often not covered in a high-school calculus course, but we feel that exposure to the rigorous definition of limit 
is an important part of students’ mathematical development. In particular, stating key definitions rigorously is 
necessary to prove later results, and our goal will be to prove (and not merely assert) as much as possible. 


Chapters 3-5 are the heart of the subject: differentiation (Chapters 3 and 4) and integration (Chapter 5). These 
are the core concepts of calculus. 


Chapters 6 and 7 deal with the concept of “infinity,” of which most students have a vague understanding but 
which we will attempt to make precise. Chapter 7 also includes the very important topic of Taylor series, which 
is a fundamental topic of analysis. 


Chapters 8 and 9 are essentially independent, and each serves as an introduction to the broader world of analysis
beyond calculus. Chapter 8 covers plane curves, including curves in polar coordinates, and is a nice introduction 
to topics that will be covered more thoroughly in multivariable calculus (typically the next calculus course after 
a student has mastered the topics in this book). Finally, Chapter 9 is an introduction to the theory of differential 
equations, a very broad subject which we only touch on in this book, but which is the subject of (several) courses 
of study beyond calculus. 


Throughout the book, you will see various shaded boxes and icons.


Concept: This will be a general problem-solving technique or strategy. These are the “keys”
to becoming a better problem solver!


This will be something important that should be learned. It might be a formula,
a solution technique, or a caution.


Beware if you see this box! This will point out a common mistake or pitfall.


Sidenote: This box will contain material which, although interesting, is not part of the main
material of the text. It’s OK to skip over these boxes, but if students read them,
they might learn something interesting!


Bogus Solution: Just like the impossible cube shown to the left, there’s something wrong
with any “solution” that appears in this box.


Most sections end with several Exercises. These will test students’ understanding of the material that was 
covered in the section. Students should try to solve all of the exercises. Exercises marked with a * are more 
difficult. 


All chapters conclude with several Review Problems. These are problems that test basic understanding of the 
material covered in the chapter. Students should be able to solve most or all of the Review Problems for every 
chapter—if unable to do so, then the student hasn’t yet mastered the material, and should probably go back and 
read the chapter again. 


All chapters also contain several Challenge Problems. These problems are generally more difficult than the 
other problems in the book, and will really test students’ mastery of the material. Some of them are very, very 
hard (the hardest ones are marked with a *). Students should not expect to be able to solve all of the Challenge
Problems on their first try—these are difficult problems even for experienced problem solvers. 


Many chapters will have one or more advanced sections after the end-of-chapter problems. These sections 
are denoted with a letter (such as 1.A). These sections are optional and often cover topics at a more theoretical 
level than in the main text. Eager students who work through these sections should find them rewarding, but it 
is acceptable to skip them. 


HINTS


Many problems come with one or more hints. Readers can look up the hints in the Hints section in the back 
of the book. The hints are numbered in random order, so that when looking up a hint to a problem students will 
not accidentally glance at the hint to the next problem at the same time. 


It is very important that students first try to solve each problem without resorting to the hints. Only after 
one has seriously thought about a problem and is stuck should one seek a hint. Also, for problems which have 
multiple hints, use the hints one at a time; don’t go to the second hint until having thought about the first one. 


SOLUTIONS 


The solutions to all of the Exercises, Review Problems, and Challenge Problems are in the separate Solutions 
Manual. There are some very important things to keep in mind: 


1. Students should make a serious attempt to solve each problem before looking at the solution. Don’t use 
the solutions book as a crutch to avoid really thinking about the problem first. Think hard about a problem 
before deciding to look at the solution. On the other hand, after serious effort has been made on a problem, 
students should not feel bad about looking at the solution if they are really stuck. 


2. After solving a problem, it’s usually a good idea to read the solution. The solutions book might show a 
quicker or more concise way to solve the problem, or it might have a completely different solution method. 


3. If the reader is unable to solve a particular problem and has to look at the solution in order to solve that 
problem, he or she should make a note of it. Then, the student should come back to that problem in a week 
or two to make sure that he or she is able to solve it without resorting to the solution. 


RESOURCES 


Here are some other good resources for students who wish to pursue their study of mathematics further:


• The Art of Problem Solving’s Precalculus textbook by Richard Rusczyk, in particular the chapters covering
trigonometry. The other major subjects covered in Precalculus—complex numbers and linear algebra—are 
not necessary for this calculus book, but are very important for students’ future math studies. 


• The Art of Problem Solving books, by Sandor Lehoczky and Richard Rusczyk. Whereas the book that you're
reading right now will go into great detail of one specific subject area—calculus—the Art of Problem Solving 
books cover a wide range of precalculus problem solving topics across many different areas of mathematics. 


• The www.artofproblemsolving.com website. The publishers of this book also maintain the Art of Problem
Solving website, which contains many resources for students:
- a discussion forum
- online classes
- resource lists of books, contests, and other websites
- a LaTeX tutorial
- a math and problem solving Wiki
- and much more!


• Students can hone their problem solving skills (and perhaps win prizes!) by participating in various math
contests. For U.S. high school students, some of the best-known contests are the AMC/AIME/USAMO 
series of contests (which are used to choose the U.S. team for the International Mathematical Olympiad), 
the American Regions Math League (ARML), the Mandelbrot Competition, the Harvard-MIT Mathematics 
Tournament, and the USA Mathematical Talent Search. Links to these and many other contests are available 
on the Art of Problem Solving website. 


TECHNOLOGY 


Most students who study calculus will do so with the aid of a graphing calculator, and we encourage students 
using this book to do so as well. Once students have mastered the basics, a graphing calculator can remove some 
of the tedium from long calculations, and can also serve as a valuable check of students’ work. Additionally, 
much of calculus is visual in nature, and being able to sketch, quickly and accurately, a graph of a function with a 
few keystrokes is very beneficial. However, students should be aware of the following cautions: 


1. There is a famous saying: “garbage in, garbage out.” That is, a graphing calculator is only as good as its 
user—if you enter bogus data into it, you will get bogus results. You also need to know how to properly 
use your calculator, and to make sure that it is in the correct mode (for example, while doing calculus, your 
calculator should be in “radians” mode and not in “degrees” mode). 


2. Make sure your calculator is sufficiently sophisticated. A “scientific calculator” may not have enough 
features to be broadly useful for calculus. Ideally, your calculator should be able to (a) graph functions with 
an arbitrary viewing window, (b) solve equations numerically, and (c) numerically compute derivatives and 
definite integrals. (Don’t worry if you don’t know what all these things mean yet—that’s what this book is 
for!) Top-of-the-line calculators are also able to do (b) and (c) symbolically (that is, in terms of variables) as 
well as numerically. 


3. If you are planning to take a standardized calculus examination, then make sure your calculator doesn’t do
too much. While most calculus examinations permit (or even require) the use of calculators, a “calculator” 
that is actually a handheld computer or PDA will likely not be permitted. Check with the organization 
administering your placement test to see if they have a list of approved calculators. 


We also recommend the use of symbolic computation websites. One of the best is Wolfram|Alpha (available at 
wolframalpha.com), which makes many of the features of the computational software Mathematica available on 
the web. (In fact, the author of this book used Wolfram|Alpha to check many of the calculations.) 


WEBPAGE 


The Art of Problem Solving website has a page containing links to websites with content relating to material 
in this book, as well as an errata list for the book. This page can be found at: 
http://www.artofproblemsolving.com/BookLinks/Calculus/links.php
If you find an error in this book, please check the above website to see if we have already posted a correction. If 
not, we would greatly appreciate it if you would contact us at books@artofproblemsolving.com and tell us about 
the error, so that we can issue a correction and update the errata list on the website. 


CHAPTER 1


SETS AND FUNCTIONS


Before we can dive into calculus, we need to make sure that we have a rigorous understanding of some basic 
mathematical concepts, such as sets, numbers, and functions. Some of these concepts will seem obvious, and one 
can certainly learn the mechanics of calculus without worrying about rigorous details, but to understand calculus, 
it’s important to pay attention to details. 


We will also review some special functions that play a crucial role in calculus: the trigonometric functions, and 
the exponential and logarithm functions. You should already have some prior experience with these functions. 
These types of functions are used throughout calculus, so it is important to have a thorough understanding of 
them. When you first learned about these functions (in a precalculus or intermediate algebra course, for example), 
some of their properties may have seemed like magic. In this chapter we'll try to add a bit more rigor to these 
functions. 


1.1 SETS


Sets are the building blocks of mathematics. Like many other fundamental mathematical concepts (such as 
“point” or “number”), sets are difficult to define precisely, and we’re not going to try to be terribly precise here. 


Roughly speaking, a set is a collection of objects. The objects can be anything: numbers, functions, other sets, 
any combination of these, or nothing at all. The order of the objects in the set is unimportant. All that matters is 
what objects are in the set. There might only be a finite number of objects in the set (meaning that we could count 
them if we liked), in which case the set is called a finite set. Otherwise we call it an infinite set. The objects in the 
set are called the elements or members of the set. 


There are two basic ways that we can describe a set. The first is to list its elements. For example: 
A = {2,9,22}. 


This is a set with three elements, namely 2, 9, and 22. This is often the simplest way to define or describe a set: 
We list the elements inside of curly braces, and separate the different elements by commas. As we said above, the 
order of the elements doesn’t matter, so A = {9,22,2} is exactly the same set as A = {2,9,22}. Also, each element
can only be in the set once, so for example B = {3,6,3} is not a legal set (alternatively, we can think of the second
“3” in {3,6,3} as being redundant and write {3,6,3} = {3,6}).
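
Readers who like to experiment on a computer may find it helpful to see these two conventions in action. The short Python sketch below is only an illustration, not part of the mathematical development; the variable names are our own, and Python happens to take the second view mentioned above, silently discarding a repeated element.

```python
# Python's built-in set type follows the same two conventions as the text.
A = {2, 9, 22}
same_as_A = {9, 22, 2}     # same elements, listed in a different order
with_repeat = {3, 6, 3}    # the second 3 is redundant and is discarded

order_is_irrelevant = (A == same_as_A)
repeat_is_dropped = (with_repeat == {3, 6})

print(order_is_irrelevant, repeat_is_dropped, len(with_repeat))
```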


Sometimes it’s impractical to list a big set, so we use ellipses if the pattern of the elements in the set is clear. 
For example, we feel pretty safe describing a set as {1,2,3,...,99, 100} and knowing that this is the set of the first 
100 positive integers. 


If a set is infinite, then we obviously have no hope of being able to list all the elements, since such a list would 
go on forever! But if it is clear which elements are in the set, then we can list the elements using ellipses. For 
example, the set of all positive integers can be written as {1,2,3,...}, because the pattern is clear. As another 
example, we can be pretty sure that {1,2, 4,8, 16,32,...}, without any further description, is the set of all positive 
integers that are powers of 2. Be careful though: you should only do this if your pattern is absolutely clear. Listing 
a set as {1,2,4,...} is pretty ambiguous: is it the set of all nonnegative powers of 2, or the set of all positive integers
not divisible by 3, or something else that we didn’t think of? It’s not at all clear, so we need more sample elements 
to make the pattern clear, or some words describing the set. 


Aside from listing the elements, the other basic way to describe a set is to provide a property that precisely 
defines the elements of the set. For example: 


B = {x | x is an integer}. 


In this example, the set B consists of all the integers. Some people use a colon (:) instead of the vertical bar; in 
either case, you should read the symbol as “such that.” For example, we would read our set B above as “the set 
of all x such that x is an integer.” Another common example is an interval on the real line; for example, 


{x | x is a real number and 2 < x ≤ 3}


is the interval of all real numbers that are greater than 2 and less than or equal to 3. One of the major strengths 
of this way of describing a set is that we can use this even if we don’t know explicitly what the elements are. For 
example, 
{y | y is a real number and 2y⁴ − y³ + 6y² − 11y + 12 = 0}


is perfectly valid, even if we don’t necessarily know at first glance exactly what values of y are in the set, or even 
if there are any elements in the set. 
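
Describing a set by a property has a close cousin in programming. The following Python sketch is a rough analogy under one important assumption: a computer can only test the property on a finite pool of candidates (the name `universe` below is ours, chosen for illustration), whereas the sets in the text may be infinite.

```python
# {x | x has some property} resembles a set comprehension, except that
# Python must draw x from a finite pool of candidates.
universe = range(-10, 11)   # the integers from -10 to 10, an arbitrary choice

# The even integers in the pool.
evens = {x for x in universe if x % 2 == 0}

# The integers x in the pool with 2 < x <= 3; only x = 3 qualifies.
in_interval = {x for x in universe if 2 < x <= 3}

print(sorted(evens))
print(in_interval)
```

Notice that we wrote down `in_interval` before knowing which elements it contains, just as the text describes.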


If an object x is an element of S, we write this as x ∈ S. If x is not an element of S, we write x ∉ S. For any object
x and any set S, either x ∈ S or x ∉ S, but (of course) never both. Indeed, this is the whole point of sets: a set is a
collection of the objects that belong to it, and everything else does not belong. 


There is a very special set called the empty set, denoted by ∅. This is the set with no elements at all. For
example,
{x | x is a real number and x² < 0} = ∅,


because there is no real number satisfying the property that its square is less than 0. Note that x ∉ ∅ for any x.
We sometimes also write the empty set as a list: ∅ = {}. Of course, it’s an empty list, since the empty set has no
elements. 


If S is a finite set, then we let #(S) denote the number of elements of S. We say that #(S) is the cardinality of 
S. (Note that many sources use the notation |S| in place of #(S).) For example, if S = {2,4,9,11} then #(S) = 4,
since S has four elements. Note that #(∅) = 0, since ∅ has zero elements. If S is an infinite set (such as the set of all
integers), then we cannot define #(S) without resorting to so-called transfinite cardinal numbers, which we will not 
need in this book. 
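
Membership, the empty set, and cardinality each have a direct counterpart in Python. The sketch below is only illustrative: the names are ours, and the search for a number with a negative square is limited to a small range of integers rather than all real numbers.

```python
S = {2, 4, 9, 11}

is_member = 4 in S         # corresponds to "4 is an element of S"
not_member = 5 not in S    # corresponds to "5 is not an element of S"
size_of_S = len(S)         # corresponds to #(S)

# set() is the empty set; in Python, {} by itself is an empty dict, not a set.
empty = set()

# No integer in this range has a negative square, so this set is empty.
negative_squares = {x for x in range(-10, 11) if x * x < 0}

print(is_member, not_member, size_of_S)
print(negative_squares == empty, len(empty))
```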


A set A is called a subset of a set B if every element of A is also an element of B. We think of A as a smaller set 
that is made up of some of the elements of B. More informally, we think of A as sitting “inside” B. The notation 
that we use is A ⊆ B. For example,

{3,8,11} ⊆ {2,3,8,10,11,14,16}

and
{x | x is an even integer} ⊆ {x | x is an integer}.


If A is a subset of B, we also say that B is a superset of A. 


Notice that, by definition, every set is a subset of itself. It is convenient to have a notation for when a set is 
a proper subset of another set, meaning that it is a subset but not equal to the larger set. For example, {1,2,3} is 
a proper subset of {1,2,3,4}, but {1,2,3,4} is not a proper subset of {1,2,3,4}, although it is a subset. We use the 
notation A ⊂ B to denote that A is a proper subset of B. (This is very similar to the notations < for “less than” and
≤ for “less than or equal to”: 3 ≤ 4 and 3 < 4 and 4 ≤ 4, but 4 ≮ 4.) Another way to think of this is that A ⊂ B
means that A is a subset of B, but there is some element of B that is not in A. For example, {1,2,3} ⊂ {1,2,3,4}
because 4 is not in {1,2,3}.


Unfortunately, this notation is not universally agreed upon. Many authors
use A ⊂ B to mean that A is any subset of B, not necessarily proper. Some
of these authors then use the notation A ⊊ B or A ⫋ B to mean that A is a
proper subset of B.

However, in this book, we will always use A ⊆ B to mean that A is a
subset of B, possibly equal, and use A ⊂ B to mean that A is a proper subset
of B.


The concepts and notations can get a bit confusing, and it takes a little bit of 
practice to use them properly. For example, if A = {1,2,3}, then it is correct 
to say that 1 is an element of A and that {1} is a subset of A. In notation, we 
would write 

1 ∈ A and {1} ⊆ A.


But it is not correct to say that {1} is an element of A. 


Let’s practice with some basic exercises involving elements and subsets: 
Problem 1.1: Consider the following sets:
A = {1,2,3,4,5}, B = {2,3,4}, C = {3, {4,5}}.
(a) Is A ⊆ A? Is A ⊂ A?
(b) Is B ⊆ A? Is B ⊂ A?
(c) Is C ⊆ A? Is C ⊂ A?
(d) Is 4 ∈ B? Is 4 ∈ C?
(e) List all of the subsets of B. How many are there?


Solution for Problem 1.1: 


(a) The elements of A are 1, 2, 3, 4, and 5. All of these elements are elements of A, so A ⊆ A. However, A = A, so
A is not a proper subset of A. Therefore, A ⊄ A.


(b) All of the elements of B are also elements of A, so B ⊆ A. Further, A contains elements (namely, 1 and 5) that
are not in B, so B is a proper subset of A; that is, B ⊂ A.


(c) One of the elements of C is {4,5}. This is not an element of A (it is a subset of A, which is not the same thing).
Thus C ⊈ A and by the same reasoning C ⊄ A.


(d) 4 is an element of B, so 4 ∈ B. However, 4 is not an element of C. The set C has two elements, the number 3
and the set {4,5}. Thus 4 ∉ C.


(e) We can list the subsets of B:
∅, {2}, {3}, {4}, {2,3}, {2,4}, {3,4}, {2,3,4}.


(Don’t forget that ∅ and B itself are subsets of B. More about ∅ ⊆ B in Problem 1.2 below.) Thus, there are 8
subsets of B. 
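
The answers to Problem 1.1 can be double-checked by machine. The Python sketch below is one possible way to do so; the helper `all_subsets` and the dictionary `answers` are our own illustrative names. Two details are specific to Python rather than to mathematics: for sets, `<=` tests "is a subset of" and `<` tests "is a proper subset of", and a set that is itself an element of another set must be written as a `frozenset`.

```python
from itertools import combinations

A = frozenset({1, 2, 3, 4, 5})
B = frozenset({2, 3, 4})
# The inner set {4,5} is an element of C, so it must be a frozenset.
C = frozenset({3, frozenset({4, 5})})


def all_subsets(s):
    """Return every subset of s, from the empty set up to s itself."""
    items = sorted(s)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]


# Each entry answers the two questions asked in that part of the problem.
answers = {
    "(a)": (A <= A, A < A),
    "(b)": (B <= A, B < A),
    "(c)": (C <= A, C < A),
    "(d)": (4 in B, 4 in C),
}
subsets_of_B = all_subsets(B)

for part, result in answers.items():
    print(part, result)
print("(e)", [sorted(s) for s in subsets_of_B], len(subsets_of_B))
```

The count in part (e) comes out to 8, and the list includes both the empty set and B itself.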


Note some properties of subsets: 


• Every set is a subset of itself; that is, A ⊆ A for any set A.
• The empty set is a subset of any set; that is, ∅ ⊆ A for any set A.


• If A, B are two sets such that A ⊆ B and B ⊆ A, then A = B. In fact, this is often how we prove that two sets
A and B are equal: we show that A ⊆ B and B ⊆ A. (Note the analogy to real numbers: if x ≤ y and y ≤ x,
then x = y.)


• If A and B are any two sets, then we cannot have both A ⊂ B and B ⊂ A. (Also note the analogy to real
numbers: we cannot simultaneously have x < y and y < x.)


Let’s explain one of these properties now, and you'll be asked to explain the others in the exercises. 


Solution for Problem 1.2: By the definition of subset, we know that ∅ ⊆ A if every element of ∅ is an element of A.
But ∅ has no elements. So it is true that every element of ∅ is in A, and thus ∅ ⊆ A. This may seem strange, but
you can think of it this way: if you can give me an element of ∅, then I can show that it is in A. I know I’m safe
making this claim, because I know that you can’t give me an element.


Many people find the above argument confusing; here’s another way to think about it. We know that ∅ is
not a subset of A if and only if there exists some x ∈ ∅ such that x ∉ A. But such an element x cannot possibly
exist, because ∅ has no elements! Thus, it is not true that ∅ ⊈ A, and therefore ∅ ⊆ A. (Even though the logic in
this paragraph uses a double-negative, many people find this argument easier to follow than the argument in the
previous paragraph.) □


Sidenote: The empty set is a weird object, and can lead to counterintuitive conclusions. For 
example, the statement 


If x ∈ ∅, then x is a flying yellow pig.


is a true statement: because ∅ has no elements, there is no such x. Thus we
don’t care what the “then” part of the statement says, because the “if” part of the 
statement is never satisfied. 


Just as we can perform operations on numbers, such as addition and multiplication, to get new numbers, we 
can perform operations on sets to get new sets. We start by defining the two primary operations on sets. 


Definition: The union A ∪ B of two sets A and B is the set of all objects that are elements of A or of B.


We can write this more formally as:


A ∪ B = {x | x ∈ A or x ∈ B}.


Note that our use of the word “or” in the definition does not mean “one or the other, but not both.” Instead, we 
use the word “or” to mean “one or the other, or possibly both.” (This is always the way that mathematicians use 
the word “or.”) Elements that are in both A and B are also in their union. 


Here are some examples: 


• {2,3,8} ∪ {1,7,11,13} = {1,2,3,7,8,11,13}.


• {4,8,9} ∪ {2,4,9,11} = {2,4,8,9,11}. (We don’t list 4 or 9 twice, even though they appear in both sets, since
elements are not duplicated in a set.)


• {1,2,4} ∪ {1,2,4,8} = {1,2,4,8}.


• {x | x is an even integer} ∪ {x | x is an odd integer} = {x | x is an integer}.


Definition: The intersection A ∩ B of two sets A and B is the set of all objects that are elements of both A and B.


We can write this more formally as:
A ∩ B = {x | x ∈ A and x ∈ B}.
Some examples:
• {3,5,9,11} ∩ {2,5,8,11,13} = {5,11}.
• {2,6,9,11} ∩ {2,9} = {2,9}.
• {4,9,11,16} ∩ {2,8,10,14} = ∅, since these two sets have no elements in common.
• {x | x is an even integer} ∩ {x | x is an odd integer} = ∅.


• {x | x is an even positive integer} ∩ {x | x is a prime number} = {2}.
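
The finite union and intersection examples above can be reproduced in Python, where `|` computes a union and `&` computes an intersection. This is only a sketch for readers who want to experiment; the list names are ours, and the examples involving infinite sets are left out because a computer cannot enumerate them.

```python
# Each pair holds (computed result, value stated in the text).
union_examples = [
    ({2, 3, 8} | {1, 7, 11, 13}, {1, 2, 3, 7, 8, 11, 13}),
    ({4, 8, 9} | {2, 4, 9, 11}, {2, 4, 8, 9, 11}),
    ({1, 2, 4} | {1, 2, 4, 8}, {1, 2, 4, 8}),
]
intersection_examples = [
    ({3, 5, 9, 11} & {2, 5, 8, 11, 13}, {5, 11}),
    ({2, 6, 9, 11} & {2, 9}, {2, 9}),
    ({4, 9, 11, 16} & {2, 8, 10, 14}, set()),   # no common elements
]

all_match = all(computed == expected
                for computed, expected in union_examples + intersection_examples)
print(all_match)
```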


A special case of union and intersection is shown in the following problem. 


Problem 1.3: Suppose A and B are sets such that A ⊆ B.
(a) What is A ∪ B?
(b) What is A ∩ B?


Solution for Problem 1.3:


(a) If A ⊆ B, then every element of A is also an element of B. This means that any element in A or B must be in B.
Therefore, A ∪ B ⊆ B. On the other hand, every element of B is also an element of A ∪ B, so B ⊆ A ∪ B. Hence,
A ∪ B = B.


(b) Any element in A and B must be in A, so A ∩ B ⊆ A. On the other hand, every element of A is also an element
of B, and thus also an element of A ∩ B, so A ⊆ A ∩ B. Therefore, A ∩ B = A.


□
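

As a sanity check on Problem 1.3 (not a substitute for the proof), we can try one concrete pair of sets in Python. The pair below reuses the earlier subset example from this section; any other pair with A a subset of B would serve equally well.

```python
A = {3, 8, 11}
B = {2, 3, 8, 10, 11, 14, 16}

a_is_subset_of_b = (A <= B)         # the hypothesis of the problem
union_is_B = ((A | B) == B)         # part (a)
intersection_is_A = ((A & B) == A)  # part (b)

print(a_is_subset_of_b, union_is_B, intersection_is_A)
```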


It is useful to have a word describing when two sets have no elements in common:


Our two operations—union and intersection—are distributive with respect to each other. We’ll prove one of 
the distributive laws and leave the other as an exercise. 


Solution for Problem 1.4: To prove that two sets are equal, we must show that they have the same elements. This 
means that we must show that every element in one set is also in the other set, and vice versa. 


We start by letting x be an element of A ∩ (B ∪ C). This means that x ∈ A and x ∈ (B ∪ C), which means that
x ∈ A and ((x ∈ B) or (x ∈ C)).


This can be rewritten as
(x ∈ A and x ∈ B) or (x ∈ A and x ∈ C).


Now, rewriting our statement using union and intersection of sets, we have
x ∈ ((A ∩ B) ∪ (A ∩ C)).
Thus, all elements of A ∩ (B ∪ C) are also elements of (A ∩ B) ∪ (A ∩ C), which means that
A ∩ (B ∪ C) ⊆ (A ∩ B) ∪ (A ∩ C). (*)


We’re not done! We have to show the reverse as well. To do so, let y be an element of (A ∩ B) ∪ (A ∩ C). Then
(y ∈ A and y ∈ B) or (y ∈ A and y ∈ C).


This can be written as
y ∈ A and (y ∈ B or y ∈ C),
which means that y ∈ A ∩ (B ∪ C). Thus,


(A ∩ B) ∪ (A ∩ C) ⊆ A ∩ (B ∪ C). (**)


Combining (*) and (**), we see that A ∩ (B ∪ C) and (A ∩ B) ∪ (A ∩ C) have the same elements, so they are equal.
□
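

A computer cannot replace the proof above, since the proof covers all sets at once, but it can test the law exhaustively on a small example. The Python sketch below assumes a three-element universe {1,2,3} (our choice, purely for illustration) and checks every possible triple of subsets; the helper `all_subsets` is a hypothetical name of our own.

```python
from itertools import combinations


def all_subsets(universe):
    """Return every subset of a finite collection, as frozensets."""
    items = list(universe)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]


subsets = all_subsets({1, 2, 3})   # 8 subsets, hence 8 * 8 * 8 = 512 triples

# Test A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) for every triple (A, B, C).
law_holds = all(A & (B | C) == (A & B) | (A & C)
                for A in subsets for B in subsets for C in subsets)

print(len(subsets), law_holds)
```

A single failing triple would have disproved the law; passing all 512 merely agrees with what we proved.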


We have one more important operation on sets to define; we will leave its properties as an exercise. 


That is, A \ B consists of all elements of A that are not in B. For example, if A = {1,2} and B = {2,3}, then
A \ B = {1}, because 1 ∈ A and 1 ∉ B, and 1 is the only element with this property. Similarly, B \ A = {3}.
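
In Python, the operation A \ B is written with a minus sign. The sketch below simply reproduces the example just given; the variable names are ours.

```python
A = {1, 2}
B = {2, 3}

A_minus_B = A - B   # the elements of A that are not in B
B_minus_A = B - A   # the elements of B that are not in A

print(A_minus_B, B_minus_A)
```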


Set theory is a very rich subject, and we have only scratched the surface of it. We could fill a whole book 
discussing set theory (and many such books have been written). But the set theory that we’ve done in this section 
is really all we'll need for calculus. 


EXERCISES 


1.1.1 Show that if A ⊆ B and A ⊆ C, then A ⊆ (B ∩ C). Show that the same statement is not true if we replace
every “⊆” with “⊂”.


1.1.2 Prove that for any sets A, B, and C, 


A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).


1.1.3 Prove that A ⊆ A for any set A.
1.1.4 


(a) Is set difference commutative? That is, must we have A \ B = B \ A? (If true, prove it; if false, give a 
counterexample.) 


(b) Is set difference associative? That is, must we have (A \ B) \ C = A \ (B \ C)? (If true, prove it; if false, give a 
counterexample.) 


(c) Prove that for any sets A, B, C, we have
A \ (B ∪ C) = (A \ B) ∩ (A \ C) = (A \ B) \ C.
(d) What is A \ ∅? What is ∅ \ A?
1.1.5* Show that, for any sets A, B, C,
A ∪ (B \ C) = (A ∪ B) \ (C \ A).


Hints: 53 


1.2 NUMBERS AND INTERVALS 


There are some special sets that consist of numbers. You’re almost certainly already familiar with these sets, 
but let’s discuss them with an eye towards what makes each of them special. 


We can start with the positive integers, which are our basic counting numbers:
{1, 2, 3, 4, ...}.


An average 3-year-old is pretty comfortable with positive integers (although he or she probably doesn’t yet know
the word “integer”).

Sidenote: Some mathematicians call the positive integers the natural numbers, denoted by
the symbol ℕ:


ℕ = {1, 2, 3, 4, ...}.
Other mathematicians consider 0 to be a natural number as well, so


ℕ = {0, 1, 2, 3, 4, ...}.


Unfortunately, there is no general agreement in the mathematical community as
to whether 0 should be a natural number or not. There is similar ambiguity about
the terms whole numbers and counting numbers, which (depending on who you 
ask) each refer to the set {1,2,3,...} or the set {0,1,2,3,...}. 

We will sidestep the issue by referring to the set {1,2,3,...} as “positive integers” 
and the set {0,1,2,3,...} as “nonnegative integers.” 


But positive integers, by themselves, have a lot of limitations. First of all, zero is missing. Second of all, we
can’t always subtract: for instance, 1 − 3 is undefined as a positive integer. So we include 0 and negative integers
to get the entire set of integers, denoted by ℤ:


ℤ = {..., −4, −3, −2, −1, 0, 1, 2, 3, 4, ...}.


Integers are a great improvement: now we can add and subtract, and we have 0. In fact, addition in ℤ satisfies
these other nice axioms as well:


• commutativity: for any a, b ∈ ℤ, a + b = b + a.

• associativity: for any a, b, c ∈ ℤ, (a + b) + c = a + (b + c).

• ℤ has an additive identity element, namely 0: for any a ∈ ℤ, a + 0 = 0 + a = a.

• ℤ has additive inverses: for any a ∈ ℤ, there exists b ∈ ℤ such that a + b = b + a = 0. (Of course, b = −a.)

Of course, we know that ℤ has a second operation, multiplication, that is also commutative, associative, and
has an identity element, 1. Additionally, the two operations together satisfy the distributive property:

for any a, b, c ∈ ℤ, a(b + c) = (ab) + (ac).

However, we don’t have multiplicative inverses; that is, we can’t divide. For example, 2/3 is undefined in ℤ.

To remedy this, we move up to the rational numbers, denoted by ℚ (for “quotient”).


The elements of ℚ are ratios of integers:
ℚ = {m/n | m, n ∈ ℤ and n ≠ 0}.


Note in particular that we don’t allow 0 in the denominator. Note also that equal fractions, such as 1/2 and 2/4, are not
separate elements of ℚ; rather they are different ways of representing the same number.


The set ℚ together with the operations of addition and multiplication is what is called a field: we can
add, subtract, multiply, and divide any two elements of ℚ (except that we can’t divide by 0, of course), and
these operations satisfy all the nice properties that we want them to have (commutativity, associativity, and
distributivity). So it seems like we have everything we need for our number system.

But there’s still something fundamental missing from ℚ. For instance:


Problem 1.5: Show that there is no number x ∈ ℚ such that x² = 2.


Solution for Problem 1.5: The proof that follows is one of the most famous examples in mathematics of proof by
contradiction. Our strategy will be to assume that there is a number x ∈ ℚ such that x² = 2, and then show that
this leads to a logical impossibility. This implies that the original assumption cannot be true.


If x ∈ ℚ, then we can write x = p/q, where p and q are integers that have no common factors (other than 1 and
−1). We will show that this assumption leads to a contradiction. Note that p and q having no common factors
means in particular that p and q are not both even. We will now prove that p and q in fact are both even, which is
our logical impossibility.


The fact that x² = 2 gives us (p/q)² = 2, so p² = 2q². But this means that p² is even, and hence p is even (the
square of an odd number is odd), so let p′ = p/2. Thus, we have (2p′)² = 2q², which simplifies to 2(p′)² = q². By
the same reasoning, this means that q is even. Therefore, p and q are both even. This is our contradiction! Thus,
our assumption was false, and there is no number x ∈ ℚ such that x² = 2. □
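As a numerical companion to the proof (not a substitute for it), the sketch below searches over fractions p/q with small positive denominators and confirms that none of them squares to exactly 2. The function name and the bound max_q are assumptions made for this illustration.

```python
from fractions import Fraction
from math import isqrt

def rational_square_roots_of_two(max_q):
    """Return all positive fractions p/q with 1 <= q <= max_q whose square is exactly 2."""
    hits = []
    for q in range(1, max_q + 1):
        # p = floor(sqrt(2) * q); only p and p + 1 can possibly satisfy p^2 = 2 q^2.
        p = isqrt(2 * q * q)
        for cand in (p, p + 1):
            if Fraction(cand, q) ** 2 == 2:
                hits.append(Fraction(cand, q))
    return hits

print(rational_square_roots_of_two(1000))  # an empty list, as the proof predicts
```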


Of course, when we extend our set of numbers to the real numbers (denoted ℝ), a solution to x² = 2 exists,
namely √2. But what exactly are real numbers? In other words, how can we define numbers such as √2 using
what we already have, which (for the moment) is just rational numbers?


One idea starts with how we naturally think of √2, and that is as the positive number whose square is 2.
Slightly more formally, we can “define” √2 as the positive solution to the equation x² − 2 = 0, which is an equation
that only contains rational numbers. When we add all such numbers to the set ℚ (that is, we include all numbers
that are solutions to some polynomial equation with rational coefficients), we get the set of algebraic numbers.
But the algebraic numbers still lack lots of useful numbers, such as π, or indeed any other transcendental number,
that is, any number that is not the solution of a polynomial equation with rational coefficients. So what can we do?


Instead, we are motivated by the idea that we typically think of ℝ as being equivalent to the “number line”:
a line in which every point represents a real number, and in which every real number corresponds to a unique
point on the line. When thinking of numbers as lying on a number line, we see that ℚ, by itself, contains lots of
“holes” on the line.


For instance, to go back to our earlier example, we can think of the number √2 as the number that fills the hole
between the sets {x ∈ ℚ | x < 0 or x² < 2} and {x ∈ ℚ | x > 0 and x² > 2}. Note that we can very easily describe the
above two sets just using rational numbers, and the sets contain only rational numbers. But the “hole” between
them is the number √2. When we fill in all the numbers that create such holes in ℚ, we get the real numbers ℝ.


It is possible, with a lot of work, to make the construction of ℝ rigorous. But for the purposes of our study
of calculus in this book, it’s enough to think informally of the real numbers as all the points on the number line
without any holes. The essential axiomatic fact about ℝ that we'll need for calculus is called completeness, which
essentially means “no holes.” Before stating exactly what completeness is, we'll need a few definitions:


Definition: Let S be a subset of ℝ.


• A number x ∈ ℝ is called an upper bound for S if for all y ∈ S, we have y ≤ x. If an upper bound for S
exists, we say that S is bounded above.


• A number x ∈ ℝ is called a lower bound for S if for all y ∈ S, we have y ≥ x. If a lower bound for S
exists, we say that S is bounded below.


• If S has an upper bound and a lower bound, we say that S is bounded.


Some bounds are very special:

Definition: Let S be a subset of ℝ.


• A number x ∈ ℝ is called a least upper bound (or supremum) of S if x is an upper bound for S and if
for every z ∈ ℝ such that z is an upper bound of S, we have x ≤ z.


• A number x ∈ ℝ is called a greatest lower bound (or infimum) of S if x is a lower bound for S and if for
every z ∈ ℝ such that z is a lower bound of S, we have x ≥ z.


In other words, an upper bound of a subset S ⊆ ℝ is any number greater than or equal to all of the numbers in
S. If we can find a smallest possible upper bound for S, then this number is the least upper bound of S. Similarly, a
lower bound of S is a number that is less than or equal to all of the numbers in S. If we can find a greatest possible
lower bound for S, then this number is the greatest lower bound of S.


Solution for Problem 1.6: Suppose x and y are each least upper bounds of S. Then x ≤ y (since x is a least upper
bound and y is an upper bound), but also y ≤ x (since y is a least upper bound and x is an upper bound). Since
x ≤ y and y ≤ x, we must have x = y. So the least upper bound is unique. □


The least upper bound, or supremum, of a subset S ⊆ ℝ is denoted sup S (if it exists). Similarly, the greatest
lower bound, or infimum, of S is denoted inf S (if it exists); we will leave it as an exercise to prove that if it exists,
it must be unique.


We can now precisely say what it is that makes ℝ “better” than ℚ:


Important: ℝ is complete, meaning that every nonempty subset S ⊆ ℝ that has an upper
bound has a least upper bound.


This may seem obscure, but it is the fundamental property that distinguishes ℝ from ℚ. By way of example,
let’s again look at √2 by considering the set


A = {x ∈ ℚ | x² < 2}.


As a subset of ℚ, this subset certainly has an upper bound (2 is a trivial example of an upper bound of A), but it
has no least upper bound: there is no single number in ℚ that is an upper bound of A, but smaller than all other
upper bounds of A. However, as a subset of ℝ, the set A has a least upper bound: sup A = √2 because √2 is an
upper bound of A (that is, all numbers in A are less than or equal to √2) and any other upper bound of A must be
greater than √2.
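To see the “hole” numerically, the following sketch squeezes sup A between a rational element of A and a rational upper bound of A by repeated bisection, where A = {x ∈ ℚ | x² < 2}. The function name and the number of steps are assumptions for illustration. Both endpoints stay rational at every step, yet neither ever equals the supremum.

```python
from fractions import Fraction

def approximate_sup(steps):
    """Bisect between a rational in A and a rational upper bound of A,
    where A = {x in Q | x^2 < 2}."""
    lo, hi = Fraction(1), Fraction(2)   # 1 is in A; 2 is an upper bound of A
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid        # mid is in A, so sup A is at least mid
        else:
            hi = mid        # mid is positive with mid^2 >= 2, so it is an upper bound of A
    return lo, hi

lo, hi = approximate_sup(20)
print(float(lo), float(hi))   # two rationals bracketing the square root of 2
```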


The most important subsets of ℝ that we will be regularly using are intervals:


Definition: A subset I of ℝ is called an interval if, for any a, b ∈ I and x ∈ ℝ such that a < x < b, we have
x ∈ I.


Informally, an interval I is a subset of ℝ without any “holes”: if a and b are in I, then all the numbers between
a and b are also in I. Two types of intervals are most commonly used:

Definition: Let a < b be any two real numbers. The open interval (a, b) is defined as the set


(a, b) = {x ∈ ℝ | a < x < b}.
The closed interval [a, b] is defined as the set
[a, b] = {x ∈ ℝ | a ≤ x ≤ b}.


It is easy to check that open intervals and closed intervals are indeed intervals (we leave this as an exercise). 
We think of intervals as segments of the real line. Open intervals do not include the endpoints, but closed intervals 
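do. We denote intervals on a number line as follows:


[Figure: number lines depicting the open interval (a, b), with open circles at a and b, and the closed interval [a, b], with filled-in circles at a and b.]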


Notice that we use an open circle to indicate that the endpoint of the interval is not included in the interval (as in 
the case of an open interval), and a filled-in circle to indicate that the endpoint of the interval is included in the 
interval (as in the case of a closed interval). 


We also often use the following notations for intervals that include one endpoint but not the other:
[a, b) = {x ∈ ℝ | a ≤ x < b},
(a, b] = {x ∈ ℝ | a < x ≤ b}.
Such intervals are referred to as half-open intervals.

Problem 1.7: Let a < b be any two real numbers.
(a) What are sup[a, b] and inf[a, b]?
(b) What are sup(a, b) and inf(a, b)?
(c) Comparing your answers to (a) and (b), what is the main feature that differentiates [a, b] from
(a, b)?


Solution for Problem 1.7: 


(a) If x ∈ [a, b], then a ≤ x ≤ b. This immediately tells us that a is a lower bound and b is an upper bound. In
particular, by definition sup[a, b] ≤ b, since any upper bound (in particular b) is at least sup[a, b]. On the other
hand, b ≤ sup[a, b] since b is in the set [a, b]. Thus b = sup[a, b]. Similarly, a = inf[a, b].

(b) If x ∈ (a, b), then a < x < b. As in part (a), this immediately tells us that a is a lower bound and b is an upper
bound. Any real number less than b (but greater than a) is in the interval, so no number less than b can be
an upper bound. Therefore, sup(a, b) ≥ b. But sup(a, b) ≤ b by definition (since b is an upper bound), so
sup(a, b) = b. Similarly, a = inf(a, b).

(c) From (a) and (b), we see that sup[a, b] = sup(a, b) = b and inf[a, b] = inf(a, b) = a. The difference is that in
part (a), the supremum and infimum are elements of the interval, whereas in part (b), they are not. In other
words, sup[a, b] ∈ [a, b] and inf[a, b] ∈ [a, b], but sup(a, b) ∉ (a, b) and inf(a, b) ∉ (a, b).


The last problem motivates a new definition: 


Definition: Let S be a subset of ℝ. If sup S ∈ S, we say that sup S is the maximal element of S (or simply the
maximum of S). If inf S ∈ S, we say that inf S is the minimal element of S (or simply the minimum of S).

As we saw in Problem 1.7, although a nonempty bounded subset of ℝ must have a supremum and an infimum, it need
not have a maximum or minimum.


We often use unbounded intervals too: 


Because we never think of +∞ or −∞ as a number, we don’t think of the interval as containing +∞ or −∞, and
thus the “infinite” end of the interval is always open. In particular, we never would write [a, +∞].


Solution for Problem 1.8: By construction, the interval [a, +∞) has no upper bound, since it contains arbitrarily
large elements. Thus sup[a, +∞) is undefined. On the other hand, inf[a, +∞) = a by the same argument that shows
inf[a, b] = a for any b > a. □


Finally, we can write the entire set of real numbers as an interval as ℝ = (−∞, +∞).


EXERCISES 
1.2.1 Let S ⊆ ℝ. Show that if S has a greatest lower bound, then this greatest lower bound is unique.


1.2.2. Prove that for any a < b, the open interval (a, b) and the closed interval [a, b] are indeed intervals (using our 
original definition of interval). Hints: 243 


1.2.3 


(a) Show that the intersection of any two intervals is an interval. 
(b) Show that the union of any two intervals need not be an interval. Hints: 123 


1.2.4★ If A, B are bounded intervals with A ∩ B ≠ ∅, then show that
sup(A ∩ B) = min{sup A, sup B}.
Hints: 113, 220

A function is a mathematical device that associates an input with a unique output. For example, we write
f(x) = 3x − 5


to mean the function which, for any given input x ∈ ℝ, outputs the real number 3x − 5. The x in our function
definition above is a dummy variable that takes the place of the input. This dummy variable can be anything, so


f(x) = 3x − 5, f(t) = 3t − 5, f(ξ) = 3ξ − 5
are all representations of the same function.
More formally: 


Definition: A function f from a set A to a set B, denoted f : A → B, associates to each a ∈ A an element
f(a) ∈ B. The set A is called the domain of f and the set B is called the codomain of f. We let Dom(f) denote
the domain of f and Cod(f) denote the codomain of f.


In calculus, the codomain of our functions will almost always be ℝ, and you can assume that a function has
codomain ℝ unless otherwise specified. Such a function is called real-valued for the obvious reason. If we don’t
explicitly specify a domain, then it is understood that a function has as its domain the largest subset of ℝ for which
the function definition makes sense. For example, the function f(x) = √x has domain [0, +∞), since we cannot (in
ℝ) take the square root of a negative number. However, sometimes (depending on the situation) we may find it
convenient to consider a function whose domain is a subset of its “allowed” domain.


There is another point to make about f(x) = √x. Even though the codomain of this function is ℝ (as it will be
for virtually all functions we consider in this book), the function only actually outputs values that are nonnegative
real numbers; that is, f(x) ≥ 0 for all x ∈ Dom(f). To indicate this, we say that the range of f is [0, +∞).


Definition: The range of a real-valued function f, denoted Rng(f), is the set of values that f outputs; that is,


Rng(f) = {y ∈ ℝ | y = f(x) for some x ∈ Dom(f)} = {f(x) | x ∈ Dom(f)}.


As a simple example: 


Solution for Problem 1.9: There are two conditions on the domain of f. First, since we can only take the square
root of a nonnegative number, we must have x² − 1 ≥ 0. But we also cannot have a denominator equal to zero, so
we must have x² − 1 > 0. This means that x² > 1, so x > 1 or x < −1. Thus, the domain is (−∞, −1) ∪ (1, +∞).


Since the denominator can be any positive real number, the range is (0, +∞). □


We can perform various operations on real-valued functions: 


• We can add two functions f and g:
(f + g)(x) = f(x) + g(x).
Note that this is only valid at x where both f and g are defined, so Dom(f + g) = Dom(f) ∩ Dom(g). Similarly
we can subtract, multiply, and divide:


(f − g)(x) = f(x) − g(x), (fg)(x) = f(x)g(x), (f/g)(x) = f(x)/g(x).

All of these have the same domain as f + g, with the exception of f/g, which has the added restriction that
g(x) ≠ 0.


• We can multiply a function f by a constant c ∈ ℝ:
(cf)(x) = c · f(x).
It is clear that Dom(cf) = Dom(f).


• We can compose two functions:
(f ∘ g)(x) = f(g(x)).


Note that, for (f ∘ g)(x) to be defined, we must have x ∈ Dom(g) and g(x) ∈ Dom(f). Notice also that the
order of the functions is important: f ∘ g is in general not the same as g ∘ f.
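The dependence of composition on order, and the domain condition g(x) ∈ Dom(f), can be seen in a small Python sketch. The two functions and the helper name compose are assumptions chosen for illustration, not examples from the text.

```python
import math

def f(x):
    return math.sqrt(x)      # defined only for x >= 0

def g(x):
    return x - 4             # defined for every real x

def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

f_of_g = compose(f, g)   # sqrt(x - 4), defined only when x - 4 >= 0
g_of_f = compose(g, f)   # sqrt(x) - 4, defined only when x >= 0

print(f_of_g(13))  # 3.0
print(g_of_f(9))   # -1.0
```

Evaluating f_of_g(0) raises a ValueError, because g(0) = −4 is not in the domain of f, even though 0 is in the domain of g.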


As we progress through the text we will see additional operations that we can perform on functions. 


Composition also gives us a new notion: 


Definition: Let f be a real-valued function. A real-valued function g is called an inverse of f if f(g(x)) = x
for all x ∈ Dom(g) and g(f(x)) = x for all x ∈ Dom(f). We denote this by g = f⁻¹.
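As a concrete instance of this definition, take the function f(x) = 3x − 5 used earlier in this section; the candidate inverse g(x) = (x + 5)/3 is our own choice for illustration. The sketch below checks both conditions exactly on a sample of rational inputs, which supports (but does not prove) that g is an inverse of f.

```python
from fractions import Fraction

def f(x):
    return 3 * x - 5

def g(x):
    return (x + 5) / 3       # candidate inverse of f

# Exact rational sample points, so there is no floating-point rounding.
samples = [Fraction(n, 4) for n in range(-20, 21)]

print(all(f(g(x)) == x and g(f(x)) == x for x in samples))  # True
```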


Notice that if g is an inverse of f, then f is also an inverse of g, by the symmetry of the definition. Further, 
since f and g essentially “undo” each other, the result of the next problem should make sense: 


Problem 1.10: Show that if g is an inverse of f, then Dom(f) = Rng(g) and Dom(g) = Rng(f). 


Solution for Problem 1.10: Since g(f(x)) = x for all x ∈ Dom(f), we see that x ∈ Dom(f) implies that x ∈ Rng(g). So
Dom(f) ⊆ Rng(g). On the other hand, if y ∈ Rng(g), then y = g(x) for some x ∈ Dom(g), and thus f(y) = f(g(x)) = x.
In particular, y ∈ Dom(f), so Rng(g) ⊆ Dom(f). Combining these two subset inclusions gives us Dom(f) = Rng(g).


By the symmetry of the definition of inverse, we also have that f is an inverse of g, and thus the previous
argument shows that Dom(g) = Rng(f). □


The next problem shows that we can refer to “the” inverse of f rather than “an” inverse of f: 


Problem 1.11: Show that if f⁻¹ exists, then it is unique; that is, show that if g and h are both inverses of f, then
g = h.


Solution for Problem 1.11: By Problem 1.10, we know that Dom(g) = Rng(f) = Dom(h). Let x be an element in the
shared domain of g and h; we will show that g(x) = h(x). Since x ∈ Rng(f), there is some y ∈ Dom(f) such that
f(y) = x. But then


g(x) = g(f(y)) = y = h(f(y)) = h(x).
So g(x) = h(x) for all x in their (shared) domain, and thus the functions are equal. □


There are a couple more definitions regarding functions that will be useful for us to have later on. In particular,
it’s useful to have a way to describe what happens to a set when we apply a function at each point in the set, and
it’s useful to describe all the points that get mapped to a certain set.

Definition: Let f be a function and A ⊆ Dom(f). The image of A under f, denoted f(A), is


f(A) = {y ∈ Cod(f) | y = f(x) for some x ∈ A}.
Let B ⊆ Cod(f). The preimage of B under f, denoted f⁻¹(B), is
f⁻¹(B) = {x ∈ Dom(f) | f(x) ∈ B}.


Informally, the image of A is the set where A “gets sent to,” and the preimage of B is the set “that gets sent to”
B. Notice that, by definition, f(Dom(f)) = Rng(f) and f⁻¹(Rng(f)) = Dom(f).


Note that f⁻¹ has two different but related meanings. If f has an inverse, and
x ∈ Rng(f), then f⁻¹(x) denotes the image of x under the inverse function of
f. On the other hand, if B ⊆ Cod(f), then f⁻¹(B) denotes the preimage of B
under f; we do not require that f have an inverse in order to define f⁻¹(B).
In the former, x is a number, whereas in the latter, B is a subset.


Here is an example of this notation: 


Problem 1.12: Let f(x) = x² − 3.
(a) Find f((0, 3)) and f([0, 3]).
(b) Find f⁻¹((1, 2)).
(c) Find f⁻¹((−5, −4)).


Solution for Problem 1.12: 


(a) Note that f(0) = −3 and f(3) = 6. Furthermore, we see that f is “continuously” increasing from −3 to 6 as we
increase x from 0 to 3. Thus we should have that f((0, 3)) = (−3, 6) and f([0, 3]) = [−3, 6]. We have
cheated a little here: the word “continuously” is not well-defined, and is certainly not rigorous. But if you
glance ahead a little bit, you'll see that this is the topic of the next chapter.


(b) If x² − 3 = 1, then x² = 4, so x = ±2. Also, if x² − 3 = 2, then x² = 5, so x = ±√5. Since f(x) lies strictly
between 1 and 2 exactly when x² lies strictly between 4 and 5, we get
f⁻¹((1, 2)) = (−√5, −2) ∪ (2, √5).


This shows that the preimage of an interval, even using a “nice” function like x² − 3, does not necessarily
have to be an interval.

(c) There is no value of x such that −5 < f(x) < −4, since f(x) ≥ −3 for all x ∈ ℝ. Hence f⁻¹((−5, −4)) = ∅. More
generally, if B contains no values in Rng(f), then f⁻¹(B) = ∅.

□


Images and preimages are not necessarily inverse operations, despite their suggestive notation. The next 
problem is trickier than it looks: 


Problem 1.13:
(a) Let f be a function and A ⊆ Dom(f). What is the relationship between A and f⁻¹(f(A))?
(b) Let f be a function and B ⊆ Cod(f). What is the relationship between B and f(f⁻¹(B))?
(c) Let f be a function and B ⊆ Rng(f). What is the relationship between B and f(f⁻¹(B))?


Solution for Problem 1.13:

(a) If x ∈ A, then by definition f(x) ∈ f(A). And since x is an element that gets mapped into f(A), we have
x ∈ f⁻¹(f(A)). This means that A ⊆ f⁻¹(f(A)), since every element of A is also an element of f⁻¹(f(A)).
However, if we try to reverse this argument, it doesn’t work. Indeed, consider the example f(x) = x² and
A = [0, 2]. Then f(A) = f([0, 2]) = [0, 4], and f⁻¹(f(A)) = f⁻¹([0, 4]) = [−2, 2], which is strictly larger than the
original set A. Thus, all we can say is that A ⊆ f⁻¹(f(A)).

(b) If y ∈ f(f⁻¹(B)), then by definition there is some x ∈ f⁻¹(B) such that f(x) = y. But, again by definition,
x ∈ f⁻¹(B) means that f(x) ∈ B, and since y = f(x) this means y ∈ B. Hence, every element in f(f⁻¹(B)) is
also in B, and thus f(f⁻¹(B)) ⊆ B. As in part (a), the converse is not true: for example, take f(x) = x² and
B = [−4, 4]. Then f⁻¹(B) = [−2, 2] and f(f⁻¹(B)) = f([−2, 2]) = [0, 4] ≠ B. Thus all we can say is f(f⁻¹(B)) ⊆ B.

(c) The reason that the converse fails in part (b) is that B might contain elements that are not in the range of f.
But if B ⊆ Rng(f), then every element y ∈ B satisfies y = f(x) for some x ∈ Dom(f). Then we have x ∈ f⁻¹(B),
and thus y = f(x) ∈ f(f⁻¹(B)). This proves that B ⊆ f(f⁻¹(B)), and this combined with part (b) gives us
B = f(f⁻¹(B)).


□
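The inclusions from Problem 1.13 can be observed on a finite stand-in for the squaring function. In the sketch below, the helper names image and preimage, the finite domain, and the sets A and B are all assumptions for illustration; the finite domain replaces ℝ so that the sets can be enumerated.

```python
def image(f, subset):
    """f(A): the set of outputs f(x) for x in the given subset of the domain."""
    return {f(x) for x in subset}

def preimage(f, domain, target):
    """f^{-1}(B): the set of x in the domain with f(x) in the target set."""
    return {x for x in domain if f(x) in target}

domain = set(range(-3, 4))         # finite stand-in for Dom(f)
f = lambda x: x * x                # range on this domain is {0, 1, 4, 9}

A = {0, 1, 2}
B = {-4, 0, 4}

print(preimage(f, domain, image(f, A)))   # contains A, and also -1 and -2
print(image(f, preimage(f, domain, B)))   # contained in B; -4 is not in the range
```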
EXERCISES 
1.3.1 Find the domain of f(x) = (x − 2)/√(x² − 7x + 12).


1.3.2 Find the domain and range of f(x) = |2x − 3| + 5.
1.3.3 Can a function have domain ∅? Can a function have range ∅? Explain your answers. Hints: 229


1.3.4 Show that if A ⊆ B and f is a function whose domain includes B, then f(A) ⊆ f(B). If A is a proper subset,
that is if A ⊂ B, must we have f(A) ⊂ f(B)?


1.3.5★ If f is a function with inverse f⁻¹, and y ∈ Rng(f), what is the relationship between f⁻¹(y) and f⁻¹({y})?
Hints: 103 


1.4 GRAPHS OF FUNCTIONS 


You should already be familiar with graphing functions. For example, the
picture to the right shows the graph of y = x² − 2x − 3. The investigation
of geometric properties of graphs of functions is one of the cornerstones of
calculus, and we will be dealing with graphs of functions throughout the book.
In this section, we'll give a slightly more formal definition of a graph than you
may be used to, and we'll review the important basics about graphs of linear
functions. It’s important to have these basics mastered because you'll need
them to be almost automatic as we continue through the book.


Graphs are constructed on the Cartesian plane, as in our picture to the right. 
The Cartesian plane is the set of all ordered pairs of real numbers; in other words, it is: 


{(x, y) | x, y ∈ ℝ}.


This is often denoted ℝ × ℝ or ℝ².

You should know from your past mathematical experience that a point (x, y) is in the graph of a function f(x)
if and only if y = f(x). In other words, the graph of f is the set of all points (x, f(x)) where x ∈ Dom(f). We’ll use
this as our formal definition of a graph:


Definition: The graph of the real-valued function f is the set


{(x, f(x)) | x ∈ Dom(f)}.


This may seem like a strange definition: we’ve defined a graph to be a set rather than a geometric object. But 
sets and functions are the things that we can handle most rigorously. Of course, when we draw the picture of a 
graph in the Cartesian plane, we are identifying the graph with a geometric object. We'll use the word “graph” 
interchangeably to mean both the set of ordered pairs (x, f(x)) and the geometric object that results when we plot 
all of these ordered pairs on the Cartesian plane.


We'll start with the fundamental geometric property of graphs of functions: 


Problem 1.14: Let f be a function. Explain why every vertical line in the plane intersects the graph of f in at
most one point.


Solution for Problem 1.14: For any a ∈ ℝ, the vertical line x = a in the plane is the subset {(a, y) | y ∈ ℝ}. However,
there is at most one point in the graph of f with first coordinate a, and that is the point (a, f(a)). (This is just a
restatement of the fact that a function produces a unique output for each input.) Therefore, the vertical line x = a
will intersect the graph of f in the point (a, f(a)) if a ∈ Dom(f). Of course, if a ∉ Dom(f), then there is no point in
the graph of f with first coordinate a, and hence the line x = a does not intersect the graph of f at all. □


This is called the Vertical Line Test: 


The Vertical Line Test: for any function f, any vertical line (that is, any line 
given by x = a for some constant a ∈ ℝ) must intersect the graph of f in at most
1 point. Conversely, if there is a subset S of the Cartesian plane for which there 
is some vertical line x = a that intersects S in more than 1 point, then S cannot 
be the graph of a function. 


The intuition of the converse portion of the Vertical Line Test should be clear: if
some line x = a intersects S in two distinct points (a, b) and (a, c), then there’s no way
S could be the graph of a function f, since we’d need f(a) = b and f(a) = c, but f(a)
can only equal one number.
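For a finite set of points, the Vertical Line Test can be carried out mechanically. The sketch below is one possible way to do so; the function name and the two sample point sets are assumptions for illustration.

```python
def is_graph_of_function(points):
    """Return True if no two points share an x-coordinate but differ in y."""
    seen = {}
    for x, y in points:
        if x in seen and seen[x] != y:
            return False      # the vertical line through x meets two distinct points
        seen[x] = y
    return True

# Points on the parabola y = x^2 - 2x - 3 at integer x-values.
parabola = {(x, x * x - 2 * x - 3) for x in range(-2, 5)}
# Four points on the unit circle; (0, 1) and (0, -1) share x = 0.
circle_points = {(0, 1), (0, -1), (1, 0), (-1, 0)}

print(is_graph_of_function(parabola))       # True
print(is_graph_of_function(circle_points))  # False
```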


There’s no such restriction about horizontal lines. For example, looking back at our
parabola y = x² − 2x − 3, shown at right, there are lots of horizontal lines that intersect
the graph in more than one point. However, graphs in which every horizontal line
intersects the graph in at most one point have a very special property:


Problem 1.15: Let f be a function, and suppose that every horizontal line y = b intersects the graph of f in at
most one point. Show that f has an inverse f⁻¹.


Solution for Problem 1.15: Suppose the line y = b intersects the graph of f at (a,b). We know that b = f(a) by 
definition, since it lies on the graph of f, so our inverse for f will have to map b back to a. This tells us how to 
define the inverse. 


Define a function g as follows: if the line y = b intersects the graph of f at the point (a, b), then g(b) = a. Note
that Dom(g) = Rng(f). Since any such horizontal line will intersect the graph in at most one point, the function g
is well-defined. By construction, we have g(f(a)) = a for any a ∈ Dom(f). Furthermore, if b ∈ Dom(g), then the
line y = b intersects the graph of f at some point (a, b), so f(g(b)) = f(a) = b.


Thus g = f⁻¹. □
This leads to: 


The Horizontal Line Test: given a function f, if every horizontal line (that is,
any line given by y = b for some constant b ∈ ℝ) intersects the graph of f in at
most 1 point, then f has an inverse f⁻¹. Conversely, if there is some horizontal
line y = b that intersects the graph of f in more than one point, then f does not
have an inverse.
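On a finite domain, the construction from the solution to Problem 1.15 amounts to building a lookup table that sends each output b = f(a) back to a. The sketch below follows that idea; the function name inverse_table and the sample functions are assumptions for illustration.

```python
def inverse_table(f, domain):
    """Return a dict mapping f(a) -> a, or None if two inputs share an output."""
    table = {}
    for a in domain:
        b = f(a)
        if b in table and table[b] != a:
            return None       # the horizontal line y = b meets the graph twice
        table[b] = a
    return table

domain = range(-3, 4)
print(inverse_table(lambda x: 2 * x + 1, domain))   # a dict: each output has exactly one input
print(inverse_table(lambda x: x * x, domain))       # None
```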


We have not proved the converse part of the Horizontal Line Test—we will leave this as an exercise. 


The most basic and important graphs in calculus are lines. You should already know how to work with lines 
in the Cartesian plane. Specifically, the graph of f(x) = mx +b is the line with slope m that passes through the point 
(0,b). If m = 0, this is just the horizontal line y = b. Note that a vertical line x = a is not the graph of a function, 
since it does not pass the Vertical Line Test. 


Conversely, given a non-vertical line ℓ in the plane, we can select any two distinct points on the line, (x₁, y₁)
and (x₂, y₂). Then the slope of the line is
m = (y₂ − y₁)/(x₂ − x₁),
and thus ℓ is the graph of y = mx + b, where (0, b) is the y-intercept of the graph.
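The slope formula, together with the fact that (x₁, y₁) lies on the line (so that b = y₁ − m·x₁), gives a direct way to recover m and b from two points. The sketch below assumes integer coordinates so that exact fractions can be used; the function name is hypothetical.

```python
from fractions import Fraction

def line_through(p1, p2):
    """Return (m, b) with y = m*x + b for the line through two points
    with integer coordinates."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: not the graph of a function")
    m = Fraction(y2 - y1, x2 - x1)   # slope
    b = y1 - m * x1                  # intercept, since y1 = m*x1 + b
    return m, b

m, b = line_through((1, 2), (3, 8))
print(m, b)   # 3 -1
```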


The last important graph skill that we'll cover in this section is the ability to identify and draw “related” graphs 
of a function. The following exercise is typical: 


Problem 1.16: Shown at right is the graph of a function f. Sketch the graphs of
the following related functions (don’t worry about exact accuracy or scale, just try
to show the general behavior):

(a) g(x) = f(x + 1)
(b) g(x) = f(x) + 1
(c) g(x) = f(−x)
(d) g(x) = f(2x)
(e) g(x) = 2f(x)


Solution for Problem 1.16: In all of the pictures in this solution, the original graph of f is shown as a dotted curve, 
and the graph of g is shown as a darker, solid curve. 


(a) Note that if (x, y) is on the graph of f, then (x − 1, y) is on the graph of g, since
g(x − 1) = f((x − 1) + 1) = f(x) = y.


In other words, each point on the graph of g is 1 unit to the left of the corresponding point on the graph of f. 
Thus, the graph of g is just the graph of f shifted 1 unit to the left, as in picture (a) below. 

(b) If (x, y) is on the graph of f, then (x, y + 1) is on the graph of g, since g(x) = f(x) + 1 = y + 1. That is, each
point on the graph of g is 1 unit above the corresponding point on the graph of f. Thus, the graph of g is just
the graph of f shifted 1 unit upwards, as in picture (b) below.

[Figure: pictures (a), (b), and (c), each showing the graph of f as a dotted curve and the graph of g as a solid curve.]


(c) If (x, y) is on the graph of f, then (−x, y) is on the graph of g, since g(−x) = f(x) = y. Thus, every point (x, y)
on the graph of f is reflected over the y-axis to the point (−x, y) on the graph of g. Thus, the graph of g is the
graph of f reflected over the y-axis, as in picture (c) above.

(d) If (x,y) is on the graph of f, then (x/2, y) is on the graph of g, since g(x/2) = f(x) = y. So each point (x, y) 
on the graph of f gets contracted horizontally by a factor of 2 towards the y-axis to the point (x/2, y) on the 
graph of g. Thus, the graph of g is the graph of f contracted horizontally by a factor of 2, as in picture (d) 
below. 


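[Figure: pictures (d) and (e), each showing the graph of f as a dotted curve and the graph of g as a solid curve.]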


(e) If (x,y) is on the graph of f, then (x, 2y) is on the graph of g, since g(x) = 2f(x) = 2y. Thus, each point (x, y) on 
the graph of f gets expanded vertically by a factor of 2 away from the x-axis to the point (x, 2y) on the graph 
of g. Therefore, the graph of g is the graph of f expanded vertically by a factor of 2, as in picture (e) above. 


□
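The five point correspondences derived in this solution can be collected in one place. The sketch below maps each point (x, y) on the graph of f to the matching point on the graph of each related function g; the function name and the sample points are hypothetical, since the text does not give a formula for f.

```python
def transformed_points(points):
    """Map each point (x, y) on the graph of f to the matching point on each g."""
    return {
        "f(x + 1)": [(x - 1, y) for x, y in points],   # shift 1 unit left
        "f(x) + 1": [(x, y + 1) for x, y in points],   # shift 1 unit up
        "f(-x)":    [(-x, y) for x, y in points],      # reflect over the y-axis
        "f(2x)":    [(x / 2, y) for x, y in points],   # contract horizontally by 2
        "2f(x)":    [(x, 2 * y) for x, y in points],   # expand vertically by 2
    }

sample = [(0, 1), (2, 3)]      # hypothetical points on the graph of f
for name, pts in transformed_points(sample).items():
    print(name, pts)
```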


EXERCISES 
1.4.1 Prove that if f is a function and there is a horizontal line y = b that intersects the graph of f in more than
one point, then f cannot have an inverse. (This is the “converse” part of the Horizontal Line Test.)
1.4.2 Suppose that f(x) = x² − 2x and g(x) = √(1 − x).

(a) Find the domain and range of f and g.

(b) Find (f ∘ g)(x), (g ∘ f)(x), and their domains and ranges.

(c) Graph f(x) and f(x − 2) + 3.

(d) Graph g(x), 4g(2x), and −g(−3x + 1).


1.4.3 Show that the converse of the Vertical Line Test is true: that is, if A ⊆ ℝ × ℝ satisfies the Vertical Line Test,
then A must be the graph of some real-valued function f.


1.5 TRIGONOMETRIC FUNCTIONS


WARNING!! The discussion of trigonometric functions in this section is intended to be
review. You should already have significant experience with trigonometry
before attempting to learn calculus. Art of Problem Solving’s Precalculus
textbook contains a thorough treatment of trigonometry.

You are already familiar with the trigonometric functions of a right triangle. If A is an acute angle of a right
triangle, then:

sin A = (length of side opposite A) / (length of hypotenuse) = a/c,
cos A = (length of side adjacent to A) / (length of hypotenuse) = b/c,
tan A = (length of side opposite A) / (length of side adjacent to A) = a/b = sin A / cos A.

[Figure: right triangle with acute angle A, opposite side a, adjacent side b, and hypotenuse c.]


Because any two right triangles with the same angles are similar, the ratios of corresponding sides are equal. 
Therefore, the trigonometric functions depend only on an angle and not on a particular choice of triangle. 


In calculus (as with most of higher mathematics), we always measure our angles in radians. Radians are 
a much more natural system of measurement than degrees (which depend on the fairly arbitrary choice of 360 
degrees in a circle). 


Definition: Let C be a circle of radius 1, and let A be a central angle of C (that is, an angle 
with its vertex at the center of C). The measure of A in radians is equal to the length of the 
arc that is subtended by A. 


In particular, we can compute the radian measure of some common angles: 


• A right angle measures π/2, since it subtends one-quarter of the circle. The circumference of the circle is 2π,
so the length of the subtended arc is 2π/4 = π/2.

• In an isosceles right triangle, the two acute angles each measure π/4.


The key fact to remember when computing radian measures is that
2π radians = 360 degrees.


So, we can always easily convert between degrees and radians by setting up a ratio.

Problem 1.17:


(a) Suppose angle A measures d degrees. What is its measure in radians? 
(b) Suppose angle B measures r radians. What is its measure in degrees? 


Solution for Problem 1.17: 
(a) Let the radian measure of A be r. Then we must have the ratio

r/d = 2π/360.

Solving for r, we see that r = d(π/180).


(b) Let the degree measure of B be d. Then we must have the ratio

d/r = 360/(2π).

Solving for d, we see that d = r(180/π).
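
The two conversions from Problem 1.17 translate directly into code. This is a minimal sketch; the function names are our own, and Python's standard library already offers math.radians and math.degrees for the same job.

```python
import math

def degrees_to_radians(d):
    """r = d * (pi / 180), from the ratio r/d = 2*pi/360."""
    return d * math.pi / 180

def radians_to_degrees(r):
    """d = r * (180 / pi), from the ratio d/r = 360/(2*pi)."""
    return r * 180 / math.pi

print(degrees_to_radians(90))       # a right angle: pi/2 radians
print(radians_to_degrees(math.pi))  # half of a full turn: 180 degrees

# The hand-written versions agree with the standard library.
assert math.isclose(degrees_to_radians(90), math.radians(90))
assert math.isclose(radians_to_degrees(math.pi), math.degrees(math.pi))
```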


WARNING!! If you are using a calculator in your study of calculus, make sure that you
put it in “radians” mode and not in “degrees” mode.


There are two basic right triangles that you should know. They are
shown at right with their angles labeled in radians. You should already
know these triangles from your previous study of geometry. In degrees,
the one on the left is a 30-60-90 triangle and the one on the right is a 45-45-90
triangle. But from now on, we always want to be thinking about
them in terms of radians. We can make the following chart of the trig
functions for the acute angles in these triangles:

[Figure: a right triangle with sides 1, √3, 2 and a right triangle with sides 1, 1, √2.]

θ       sin θ    cos θ    tan θ
π/6     1/2      √3/2     √3/3
π/4     √2/2     √2/2     1
π/3     √3/2     1/2      √3


You should learn the above table. It should be internalized to the extent that it is
as basic as addition. You shouldn’t have to think very hard about what sin(π/6) is;
it should be automatic that this is 1/2.


Using right triangles, we can define our trigonometric functions for any angle θ such that 0 < θ < π/2, since any
acute angle measures between 0 and π/2 (in radians, of course!).

However, we want sin and cos to be functions with domains of all of ℝ.

To extend their definitions beyond acute angles, we use the unit circle. The
unit circle is the circle with radius 1, centered on the coordinate plane at the
origin (0,0). We construct a right triangle with one leg along the x-axis and
hypotenuse equal to a radius of the circle, and with acute angle θ in the first
quadrant, as in the picture to the right. We can then determine the lengths of
the sides of the triangle, and we see from the picture that the point on the circle
corresponding to the angle θ is (cos θ, sin θ).


A common mistake is to incorrectly write points on the unit circle with their
coordinates backwards. Remember, cos is the x-coordinate and sin is the
y-coordinate. (An easy way to remember this is that cos comes before sin
alphabetically, just as x comes before y.)


The advantage of using the unit circle to define sine and cosine is that we
can easily extend this to any angle, as in the picture to the right. The angle θ
is between π/2 and π in radians. Just as we did with acute angles, we can define
(cos θ, sin θ) to be the point on the circle corresponding to a central angle θ.
Note now, though, that cos θ is negative, since the point on
the circle corresponding to the angle θ lies in the second quadrant, where x-coordinates are negative.
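
The unit-circle definition can be explored numerically. The sketch below uses a sample second-quadrant angle of our own choosing (2π/3) and a hypothetical helper name; it simply reads off the point (cos θ, sin θ).

```python
import math

def unit_circle_point(theta):
    """Return the point (cos theta, sin theta) on the unit circle."""
    return (math.cos(theta), math.sin(theta))

theta = 2 * math.pi / 3   # sample angle between pi/2 and pi (an assumed value)
x, y = unit_circle_point(theta)
print(x, y)

# Second quadrant: the x-coordinate (cosine) is negative, the y-coordinate (sine) is positive.
assert x < 0 < y
# The point really does lie on the circle of radius 1 centered at the origin.
assert math.isclose(x * x + y * y, 1.0)
```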


You should become proficient in computing the sine and cosine of any angle
that is an integer multiple of π/6 or π/4. Here is an exercise to practice:


Problem 1.18:
(a) Compute cos(7π/6) and sin(7π/6).
(b) Compute cos(7π/4) and sin(7π/4).
(c) Compute cos(-4π/3) and sin(-4π/3).


Solution for Problem 1.18:
(a) We draw an angle of θ = 7π/6 at right. We see that this puts us in the third
quadrant, so both the cosine and the sine will be negative. We can now
essentially read from the picture that cos(7π/6) = -√3/2 and sin(7π/6) = -1/2.

(b) We draw an angle of θ = 7π/4 at right. We see that this puts us in the fourth
quadrant, so the cosine will be positive and the sine will be negative. We
have that cos(7π/4) = √2/2 and sin(7π/4) = -√2/2.

(c) We draw an angle of θ = -4π/3 at right. Note that this angle goes in the
clockwise direction starting from the positive x-axis, because the angle is
negative. We have that cos(-4π/3) = -1/2 and sin(-4π/3) = √3/2. □


We define tangent for (almost) any angle in terms of sine and cosine. Specifically, for any angle θ such that
cos θ ≠ 0, we define

tan θ = sin θ / cos θ.

If cos θ = 0, then tan θ is undefined.


Problem 1.19:
(a) Compute sin 0, cos 0, and tan 0.
(b) Compute sin(π/2), cos(π/2), and tan(π/2).


Solution for Problem 1.19:

(a) An angle of 0 means that we don’t move off the positive x-axis at all. Hence the point on the unit circle
corresponding to an angle of 0 is (1,0), the point where the circle intersects the positive x-axis. Thus, cos 0 = 1
and sin 0 = 0. Also, we have tan 0 = sin 0 / cos 0 = 0/1 = 0.

(b) An angle of π/2 means that we move 1/4 of the way around the circle (counterclockwise starting at
the positive x-axis). Hence the point corresponding to an angle of π/2 radians is the point (0,1). Thus cos(π/2) = 0
and sin(π/2) = 1. Then, we have tan(π/2) = sin(π/2) / cos(π/2) = 1/0, which is undefined. So tan(π/2) is undefined.

□


Problem 1.20: What are the domains and ranges of sine, cosine, and tangent?


Solution for Problem 1.20: Using our unit circle formulation, we see that we can define sine and cosine for any real 
angle, so Dom(sin) = Dom(cos) = R. Since all points on the unit circle have coordinates that are between —1 and 
1 (inclusive), we have Rng(sin) = Rng(cos) = [—1, 1]. 


Tangent is defined only for angles at which cosine (the denominator of tangent) is nonzero. By looking at the
unit circle, we see that cosine is 0 at the angles where the circle crosses the y-axis, so cos(θ) = 0 if and only if θ is
an odd integer multiple of π/2, that is, if and only if

θ ∈ {..., -5π/2, -3π/2, -π/2, π/2, 3π/2, 5π/2, ...}.

Thus we get that
Dom(tan) = ℝ \ {π/2 + πk | k ∈ ℤ}.

It is also easy to see that Rng(tan) = ℝ. □


Problem 1.21: Let θ be any angle. How is sin(θ + 2π) related to sin θ?

Solution for Problem 1.21: Adding 2π to an angle θ corresponds to making a complete revolution around the unit
circle, since the entire circle has angle 2π. So θ and θ + 2π give us the same point on the unit circle, and thus
sin(θ + 2π) = sin θ for any angle θ. □
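
As a quick sanity check of Problem 1.21, we can compare a trig function at θ and at θ plus one full revolution for a few sample angles. The angles and the helper name below are arbitrary choices for illustration, and floating-point arithmetic only lets us confirm agreement up to rounding error.

```python
import math

def shifted_by_full_turn(func, theta):
    """Evaluate func one full revolution (2*pi) past theta."""
    return func(theta + 2 * math.pi)

# Arbitrary sample angles, in radians.
for theta in (0.0, 0.5, 2.0, -1.3, 10.0):
    assert math.isclose(shifted_by_full_turn(math.sin, theta), math.sin(theta), abs_tol=1e-12)
    assert math.isclose(shifted_by_full_turn(math.cos, theta), math.cos(theta), abs_tol=1e-12)
```

A numerical check like this cannot replace the unit-circle argument, but it is a handy way to catch a misremembered identity.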


Of course, the same argument shows that cos(θ + 2π) = cos θ and tan(θ + 2π) = tan θ. This sort of situation
comes up frequently in mathematics, so we have some terminology for it:


In light of Problem 1.21, we can show that the sine and cosine functions are periodic with period 2π. (In fact,
tangent is periodic with a smaller period; we will leave this fact as an exercise.) Graphically, this means that the
graphs y = sin x and y = cos x repeat every 2π units along the x-axis. We can sketch the graphs by plotting the
points that we know: the points where x is a multiple of π/6 or π/4. Plotting these and then “connecting the dots”
gives us our graphs of sine and cosine:

[Figure: graphs of y = sin x and y = cos x.]

You probably notice that the graph of cosine appears to be the graph of sine shifted π/2 units to the left. In fact,
cos x = sin(x + π/2) for all x. You’ll be asked to explain this and other trig relationships as an exercise.


There are three other trigonometric functions that are commonly used. They are simply the reciprocals of the
functions that we already have.

secant: sec θ = 1 / cos θ,
cosecant: csc θ = 1 / sin θ,
cotangent: cot θ = cos θ / sin θ.

Notice that cot θ = 1 / tan θ where both sides are defined. However, we have cot(π/2) = 0 whereas tan(π/2) is undefined.

Note that cosecant is the reciprocal of sine and secant is the reciprocal of
cosine.


We also need to discuss the inverses of the trig functions. Of course, they do not have inverses as we defined
“inverse function” in Section 1.3, as their graphs pretty badly fail the horizontal line test. For example,

sin 0 = sin 2π = sin 4π = ··· = 0,

so we can’t explicitly determine sin⁻¹(0): there are infinitely many θ such that sin θ = 0. But we’ll cheat and
“define” sin⁻¹ by restricting the domain of the sine function to an appropriate interval.

Definition: The inverse sine function, denoted sin⁻¹ x or arcsin x, is the function with domain [-1, 1] that
satisfies sin⁻¹(sin x) = x for all x ∈ [-π/2, π/2].

Thus, sin⁻¹ is the function that takes a value between -1 and 1 (inclusive) and outputs the angle in [-π/2, π/2]
whose sine is that value. Note that this is not technically an inverse of sine; however, it is the inverse of the
function we get by restricting the domain of the sine function to [-π/2, π/2].


We can define the inverse cosine function too, but we need to make a slight modification. 


Problem 1.22: Define cos⁻¹ in a similar fashion as we defined sin⁻¹. How do the domain and range differ?

Solution for Problem 1.22: We want cos⁻¹ to have the same domain [-1, 1] as sin⁻¹, because the range of cosine is
[-1, 1]. However, the range of cos⁻¹ can’t be [-π/2, π/2], since cosine only takes nonnegative values for angles in this
interval. We need the range of cos⁻¹ to be an interval on which cosine takes all values in [-1, 1]. The most obvious
choice is the interval [0, π], so we have our definition:

Definition: The inverse cosine function, denoted cos⁻¹ x or arccos x, is the function with domain [-1, 1] that
satisfies cos⁻¹(cos x) = x for all x ∈ [0, π].


WARNING!! Most calculus books use the two inverse trig notations interchangeably. In
this book, we will frequently switch between sin⁻¹ and arcsin to get you used
to seeing both notations.


Naturally, we also have an inverse tangent function, with the appropriate modifications to the domain and 
range: 


Definition: The inverse tangent function, denoted tan⁻¹ x or arctan x, is the function with domain ℝ that
satisfies tan⁻¹(tan x) = x for all x ∈ (-π/2, π/2).
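
The restricted ranges in these three definitions are easy to see in practice. The sketch below uses Python's math.asin, math.acos, and math.atan, which follow the same range conventions; the sample inputs 0.4 and 2.0 and the helper names are our own choices.

```python
import math

def arcsin_of_sin(x):
    return math.asin(math.sin(x))

def arccos_of_cos(x):
    return math.acos(math.cos(x))

def arctan_of_tan(x):
    return math.atan(math.tan(x))

# Inside [-pi/2, pi/2], arcsin undoes sin.
assert math.isclose(arcsin_of_sin(0.4), 0.4)

# The angle 2.0 lies outside [-pi/2, pi/2], so arcsin returns a different
# angle with the same sine, namely pi - 2.
assert math.isclose(arcsin_of_sin(2.0), math.pi - 2.0)

# But 2.0 does lie in [0, pi], so arccos recovers it.
assert math.isclose(arccos_of_cos(2.0), 2.0)

# arctan returns the angle in (-pi/2, pi/2) with the same tangent: 2 - pi.
assert math.isclose(arctan_of_tan(2.0), 2.0 - math.pi)
```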


EXERCISES 
1.5.1 Determine the domains and ranges of sec, csc, and cot. 
1.5.2

(a) Write sin(-θ) in terms of sin θ.
(b) Write cos(-θ) in terms of cos θ.

1.5.3 Show that tan(θ + π) = tan θ for any θ ∈ ℝ such that θ ≠ (k + 1/2)π for any integer k.


1.5.4 Find sin(π/2 + θ) and cos(π/2 + θ) in terms of sin θ and cos θ.
1.5.5 Find the following quantities in terms of sin θ or cos θ:

(a) cos(π/2 - θ) (b) sin(π - θ) (c) cos(π + θ)

1.5.6 Prove that sin and cos each has no period smaller than 2π. Is the same true for tan?


1.5.7 Compute the following: 
(a) cos0 (b) sin’ } (c) cos™ (-¥) (d) tan7!(-1) 


1.5.8 Find an angle θ such that cos⁻¹(cos θ) = θ but sin⁻¹(sin θ) ≠ θ. Hints: 254


1.6 BASIC TRIGONOMETRIC IDENTITIES 


In this section we'll prove some basic identities for the trig functions. The first identity is the most fundamental 
and also the most useful. 


Problem 1.23: Prove that, for any θ ∈ ℝ,

(sin θ)² + (cos θ)² = 1.

Solution for Problem 1.23: We recall that, for any θ ∈ ℝ, the point (cos θ, sin θ) is, by definition, on the circle
centered at (0,0) with radius 1. But of course any point (x, y) on that circle satisfies the equation x² + y² = 1, and
that proves our identity! □

By convention, we usually write powers of trig functions with a notational shorthand: sin² θ denotes (sin θ)²
and similarly cos² θ denotes (cos θ)². Thus, our fundamental identity becomes simply:

Important: For any θ ∈ ℝ,

sin² θ + cos² θ = 1.


It’s hard to overstate how fundamental this identity is to calculus, and we'll see it reappear time and time 
again throughout the rest of the book. 


Next, we have the angle addition and subtraction formulas. These are a bit difficult to prove (see Section 1.A 
for some further discussion), so we will just present them without proof: 


Important: For any α, β ∈ ℝ,

sin(α + β) = sin α cos β + cos α sin β
sin(α - β) = sin α cos β - cos α sin β
cos(α + β) = cos α cos β - sin α sin β
cos(α - β) = cos α cos β + sin α sin β


While these formulas in their general form are not especially useful for calculus, some specific applications of 
them are very important. In particular: 


Problem 1.24: Find simple formulas for sin 2θ and cos 2θ in terms of sin θ and/or cos θ.

Solution for Problem 1.24: We can just use our angle-addition formulas from above. Specifically,

sin 2θ = sin(θ + θ) = sin θ cos θ + cos θ sin θ = 2 sin θ cos θ.

Similarly,
cos 2θ = cos(θ + θ) = cos² θ - sin² θ.

The latter formula is sometimes simplified using sin² θ + cos² θ = 1 to write it only in terms of sin or of cos:

cos 2θ = cos² θ - sin² θ = 2 cos² θ - 1 = 1 - 2 sin² θ.


The formulas from Problem 1.24 are called the double-angle formulas:

Important: Double-angle formulas: for any θ ∈ ℝ,

sin 2θ = 2 sin θ cos θ,

cos 2θ = cos² θ - sin² θ
= 2 cos² θ - 1
= 1 - 2 sin² θ.
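
The double-angle formulas can be spot-checked numerically. This sketch evaluates each form at a few arbitrarily chosen angles (an assumption for illustration; the function names are ours) and compares it with the library value of sin 2θ or cos 2θ.

```python
import math

def double_angle_sin(theta):
    """2 sin(theta) cos(theta)."""
    return 2 * math.sin(theta) * math.cos(theta)

def double_angle_cos_forms(theta):
    """The three equivalent expressions for cos(2 theta)."""
    s, c = math.sin(theta), math.cos(theta)
    return (c * c - s * s, 2 * c * c - 1, 1 - 2 * s * s)

# Arbitrary sample angles, in radians.
for theta in (0.3, 1.0, 2.5, -4.0):
    assert math.isclose(double_angle_sin(theta), math.sin(2 * theta), abs_tol=1e-12)
    for value in double_angle_cos_forms(theta):
        assert math.isclose(value, math.cos(2 * theta), abs_tol=1e-12)
```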


As an exercise, you’ll be asked to find formulas for sin(θ/2) and cos(θ/2). Not surprisingly, these are called the
half-angle formulas.


The formula in the next problem is also commonly used in calculus: 


Problem 1.25: Prove that tan² θ + 1 = sec² θ for all θ ∈ ℝ where both sides are defined.

Solution for Problem 1.25: Our first step is a common trig-problem strategy:

When dealing with a trigonometric expression, it is often helpful to write everything
in terms of sines and cosines.

We write the left side of our desired identity as

tan² θ + 1 = sin² θ / cos² θ + 1.

Putting this over a common denominator gives us

(sin² θ + cos² θ) / cos² θ.

But the numerator is just 1. Remember, this is our most basic trig identity (from Problem 1.23). Thus, our
expression is just 1 / cos² θ, which by definition is sec² θ. □

Finally, let’s see how some of these identities work in a nontrivial trig problem.


Problem 1.26: Given that t ∈ ℝ satisfies (1 + sin t)(1 + cos t) = 5/4, compute (1 - sin t)(1 - cos t). (Source: AIME)

Solution for Problem 1.26: It’s not immediately clear how to start, so we can begin by assigning a variable to what
we want to compute.

Concept: In general, when trying to evaluate a complicated expression, set some variable
equal to the complicated expression. Then try to combine this new equation with
the given data to solve for the variable.


We let x = (1 - sin t)(1 - cos t) be the expression that we want. This gives us a system of equations:

(1 + sin t)(1 + cos t) = 5/4,
(1 - sin t)(1 - cos t) = x.

There doesn’t seem to be any good choice other than to multiply out the two left sides.

1 + sin t cos t + cos t + sin t = 5/4,
1 + sin t cos t - cos t - sin t = x.

We notice that all of the left side terms are the same (but some with different signs), so adding and subtracting the
equations will make some terms cancel. In particular, adding them gives

2 + 2 sin t cos t = 5/4 + x,    (1)

and subtracting them gives

2 cos t + 2 sin t = 5/4 - x.    (2)

You might recognize the double-angle formula for sine in equation (1), and try to replace 2 sin t cos t = sin 2t. But
that doesn’t really get us closer to our goal of solving for x. We want to make all the t terms cancel. Instead,
a better idea is to square equation (2), because that will create sin² t and cos² t terms that we should be able to
eliminate completely. Squaring (2) gives us

4 cos² t + 8 sin t cos t + 4 sin² t = 25/16 - (5/2)x + x².    (3)

This is good, because 4 cos² t + 4 sin² t = 4. So (3) becomes

4 + 8 sin t cos t = 25/16 - (5/2)x + x².    (4)

Aha! Multiplying (1) by 4 and subtracting from (4) will cancel all the t terms! We get

4 - 8 = (25/16 - (5/2)x + x²) - 4(5/4 + x).

This is just a quadratic in x, which after simplification is

x² - (13/2)x + 9/16 = 0.

The quadratic formula gives x = 13/4 ± √10.

Sine and cosine have ranges [-1, 1], and these ranges often put bounds on possible
values of expressions defined in terms of sine and cosine.

We recall that we defined x = (1 - sin t)(1 - cos t). Each factor on the right side has value in the interval [0,2],
so we know that x ∈ [0,4]. However, 13/4 + √10 > 3 + 3 = 6 is way too big. So we must have x = 13/4 - √10 as our
answer. □
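
We can double-check the answer numerically. The sketch below takes the given product to be 5/4 and the claimed answer to be 13/4 - √10, finds a value of t satisfying the given equation by bisection, and then evaluates the wanted expression. The function names and the bracketing interval [-π/4, 0] are our own choices, not part of the original solution.

```python
import math

def given(t):
    return (1 + math.sin(t)) * (1 + math.cos(t))

def wanted(t):
    return (1 - math.sin(t)) * (1 - math.cos(t))

def solve_given(target=5 / 4, lo=-math.pi / 4, hi=0.0, steps=60):
    """Bisection for given(t) = target on [lo, hi].

    given(-pi/4) = 1/2 and given(0) = 2, so a solution lies in between.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2
        if given(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t = solve_given()
print(t, wanted(t))   # wanted(t) is numerically close to 13/4 - sqrt(10)
assert math.isclose(wanted(t), 13 / 4 - math.sqrt(10), rel_tol=1e-9)
```

The check uses only one solution t of the given equation; the algebra in the solution is what shows that the answer does not depend on which t we pick.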


EXERCISES 

1.6.1 Compute sin % and cos %. 

1.6.2 Find formulas for sin(θ/2) and cos(θ/2) in terms of sin θ and/or cos θ.
1.6.3 Find a formula for tan(α + β) in terms of tan α and tan β. Hints: 151


1.6.4 Find formulas for tan 2θ and tan(θ/2).


1.7 EXPONENTIALS AND LOGARITHMS


At this point, we have several classes of functions to work with: polynomial functions, square roots, and 
trigonometric functions. But there’s one more very important class of functions that we need. 


From your algebra and precalculus courses, you are familiar with exponentials and their inverses, logarithms:

a^r = b  ⟺  r = log_a(b).

For calculus, we will need a deeper understanding of these sorts of functions. In order to gain this understanding,
let’s suppose that we were trying to construct an exponential function f from scratch. What properties
should it have? We can try to answer this question by looking at a common exponential function that we should
be very familiar with:

f(x) = 2^x.

What properties does this have?
Problem 1.27: Let f(x) = 2^x.
(a) What is f(0)? What is f(1)? 
(b) Write f(a + b) in terms of f(a) and f(b). 


(c) Write f(—a) in terms of f(a). 
(d) Sketch a graph of f. What are the domain and range of f? 


Solution for Problem 1.27:
(a) f(0) = 2^0 = 1 and f(1) = 2^1 = 2.
(b) This is a basic property of exponentials:
f(a + b) = 2^(a+b) = 2^a · 2^b = f(a) f(b).

(c) We have

f(-a) = 2^(-a) = 1/2^a = 1/f(a).

We can also use the fact from part (b). We know that f(a + (-a)) = f(0) = 1, but on the other hand
f(a + (-a)) = f(a) f(-a). So f(a) f(-a) = 1, and thus f(-a) = 1/f(a).

(d) We sketch the graph of y = 2^x at right. Note that from part (a), we know that the
graph must pass through the points (0,1) and (1,2).
This graph seems to imply that the domain of f is all of ℝ and that the range
of f is the positive real numbers, but are these statements really true? Certainly
there’s no problem describing 2^x if x is a positive integer:

2^x = 2 · 2 · ··· · 2  (x terms).

Further, if x is a negative integer we have 2^x = 1/2^(-x), and we know 2^0 = 1, so
2^x is well-defined for any integer x. We can extend the function to reciprocals
of nonzero integers by defining 2^(1/n) (for any nonzero integer n) to be the unique
positive real number b such that b^n = 2. (The fact that such a number must exist
and is unique is nontrivial to prove at this stage, so we will just assume that it exists.) Going further, we can
extend the function to all rational numbers by letting

2^(m/n) = (2^m)^(1/n),

for all integers m and positive integers n. But what, for example, is 2^x when x is irrational? How do we define this?

So, at this point, although we may believe that the domain of 2^x is ℝ and the range is positive real numbers,
all we can definitively say is that the domain of 2^x contains ℚ.

□
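
The rational-exponent construction from part (d) can be mirrored in code. This is only a sketch under stated assumptions: the function name is ours, and the n-th root is computed with floating-point arithmetic, so the properties from parts (a) through (c) are confirmed only up to rounding error.

```python
from fractions import Fraction
import math

def two_to_the(q):
    """2^(m/n) = (2^m)^(1/n) for a rational q = m/n with n > 0."""
    q = Fraction(q)                       # Fraction keeps the denominator positive
    m, n = q.numerator, q.denominator
    return (2.0 ** m) ** (1.0 / n)        # n-th root taken in floating point

a, b = Fraction(1, 2), Fraction(3, 4)     # arbitrary sample rationals

# f(0) = 1 and f(1) = 2
assert two_to_the(0) == 1.0 and two_to_the(1) == 2.0
# f(a + b) = f(a) f(b)
assert math.isclose(two_to_the(a + b), two_to_the(a) * two_to_the(b))
# f(-a) = 1 / f(a)
assert math.isclose(two_to_the(-a), 1 / two_to_the(a))
```

Note that the function only accepts rational inputs, which matches the point of the discussion: at this stage we have a definition of 2^x only for rational x.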


Notice that there’s nothing special about the “2” in our above discussion; we could just as easily have worked
with f(x) = a^x for any positive real number a. The issue is that although it’s straightforward to define a^x when x is
rational, it’s not obvious what to do when x is irrational, and we’d like to have a function a^x whose domain is all
of ℝ.


Let’s focus on what should be the properties of the function a^x. To simplify things a little bit, let’s assume that
a > 1. Then we want to have:

• f(x) = a^x has domain ℝ and range (0, +∞).

• f(x) = a^x is strictly increasing: if m < n then a^m < a^n.

• a^(m+n) = a^m a^n for any real numbers m and n.


Problem 1.28: How do we need to alter the above list if a = 1 or 0 < a < 1?

Solution for Problem 1.28: If a = 1, then the function is just the constant function f(x) = 1^x = 1. So the range is just
the point {1}, and the function is constant, not increasing.

If a < 1, then the function should be strictly decreasing instead of strictly increasing: that is, if m < n, then
a^m > a^n. The domain is still ℝ and the range is still (0, +∞). This also follows from the fact that a^x = 1/a^(-x) = (1/a)^(-x) for
all x ∈ ℝ, so the function a^x is decreasing if and only if (1/a)^x is increasing.
In any case, the function should satisfy a^(m+n) = a^m a^n. □


Using these properties, we can make an abstract definition of an “exponential” function: 


Definition: Let f be a strictly increasing, strictly decreasing, or constant function with domain ℝ that takes
positive values. We say that f is exponential if f(x + y) = f(x) f(y) for all x, y ∈ ℝ.

We know that a^0 = 1 for any positive a, so we hope that our abstract exponential functions have this property
too:


Problem 1.29: Suppose that f is an exponential function.
(a) Show that f(0) = 1.
(b) Let a = f(1). Show that f(n) = a^n for any integer n.


Solution for Problem 1.29: We use a common technique:

Concept: When dealing with a functional equation, it often helps to plug in simple values
to the equation, such as 0 or 1.

(a) Set y = 0 in the functional equation f(x + y) = f(x) f(y). This gives us f(x) = f(x) f(0). Since f takes positive
values, f(x) is nonzero, so we can divide by it to get 1 = f(0).
(b) We prove f(n) = a^n for positive n by induction. Note that f(1) = a^1 = a by definition, and if f(k) = a^k for some
positive integer k, then
f(k + 1) = f(k) f(1) = (a^k)(a) = a^(k+1).
Thus f(n) = a^n for all positive integers n. By part (a), f(0) = 1 = a^0. Finally, for negative integers n, we have
f(n) f(-n) = f(0) = 1, so f(n) = 1/f(-n) = 1/a^(-n) = a^n.

□


We can extend Problem 1.29, by essentially repeating the arguments from Problem 1.27(d), to show that
f(x) = a^x for all rational x. However, this does not prove that f(x) = a^x for values of x that are irrational. In fact,
we haven’t even defined a^x when x is irrational. But f, by definition, has domain ℝ, so what we do is define a^x to
equal f(x) for the exponential function f such that f(1) = a.


There is one more fundamental exponential property that we will need, and that is
a^(rs) = (a^r)^s
for all r, s ∈ ℝ. Unfortunately, this is quite difficult to prove using our current notion of exponential, so we will
have to take it on faith (for now) that this is true, and defer the proof until Section 5.A.


The point of this discussion is that, for any positive a, we can consider f(x) = a^x
as a function whose domain is all of ℝ, and that this function has all of the nice
properties that we expect exponentials to have.


In Chapter 5, we will be able to define very precisely an exponential function, and
prove all of these properties. However, we introduce the exponential function
now, because we’d really like to be able to use it in examples throughout Chapters
2-4, even if we have to assume certain facts about the function that we cannot
prove right now.


There is one very special exponential function that we use throughout calculus. In fact, it is so special that it is
usually called the exponential function. It is the exponential with the special base e. The number e can be defined
as

e = 1 + 1/1! + 1/2! + 1/3! + ···.

There are several issues with this “definition.” First of all, it’s not at all clear how to define rigorously the above
infinite sum, but we will discuss such matters in Chapter 7. Second, even if we assume that the above number
exists (and it does; it is approximately 2.71828), what’s so special about e? Unfortunately, we can’t say too much
right now. You can see a little bit about why e is special in Section 1.A, but the real power of e will have to wait
until Chapter 3.
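
Although we cannot yet make an infinite sum rigorous, we can at least watch partial sums settle down. The sketch below assumes that the sum in question is the standard series 1/0! + 1/1! + 1/2! + ···; the function name is hypothetical, and the comparison with math.e only shows agreement to floating-point accuracy.

```python
import math

def e_partial_sum(terms):
    """Sum of 1/k! for k = 0, 1, ..., terms - 1."""
    total = 0.0
    term = 1.0               # 1/0!
    for k in range(terms):
        total += term
        term /= (k + 1)      # turns 1/k! into 1/(k+1)!
    return total

for n in (1, 2, 5, 10, 15):
    print(n, e_partial_sum(n))

# Twenty terms already agree with the library constant to machine precision.
assert math.isclose(e_partial_sum(20), math.e)
```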


Definition: The exponential function, denoted exp(x), is the exponential with base e:

exp(x) = e^x.

Again, recall that this means that exp is the strictly increasing, positive-valued function satisfying exp(1) = e
and exp(a + b) = exp(a) exp(b) for all a, b ∈ ℝ. We usually write it to look like an exponential, so we would write
the above properties as e^1 = e and e^(a+b) = e^a e^b.


Right now, this is mostly a tease. We’ll see why exp is a really nice function in Chapter 3, but we won't be able 
to make the definition rigorous, or to prove facts about the exponential function, until Chapter 5. 


Exponential functions have inverses, called logarithms. For example,
2^4 = 16  ⟺  log_2 16 = 4.
More generally, if a is any positive real number and x ∈ ℝ, then
a^x = y  ⟺  log_a y = x.
And if we let a be our special number e, then we have
e^x = y  ⟺  log_e y = x.
More abstractly, we defined the exponential function to be a strictly increasing function with domain ℝ and
range (0, +∞). Because it is strictly increasing, it has an inverse (you will explain why as an exercise). This allows
us to define our “natural” logarithm function.


Definition: The natural logarithm function log : (0, +∞) → ℝ is the inverse of the exponential function; that
is, for any x ∈ (0, +∞) and y ∈ ℝ, we have

log(x) = y  ⟺  x = e^y.

Note that we think of log as being log_e; that is, a logarithm with “base” e.


WARNING!! Most calculus textbooks (and the AP Calculus Examination), and most calculators,
use the notation ln for the natural logarithm function. But most
mathematicians use log, thus so shall we.


Let’s verify that the log function has the properties that a good logarithm should have. 


Problem 1.30:
(a) What is log(1)?
(b) Show that, if a, b ∈ (0, +∞), then log(ab) = log(a) + log(b).
(c) If r ∈ ℝ, then what is log(a^r)?


Solution for Problem 1.30:

(a) Since exp(0) = 1, we have log(1) = 0.
(b) Let A = log(a) and B = log(b). Then exp(A) = a and exp(B) = b, so exp(A + B) = (exp A)(exp B) = ab. Therefore,
log(ab) = A + B = log(a) + log(b).
(c) Let A = log(a). Then exp(A) = a, so exp(rA) = e^(rA) = (e^A)^r = (exp(A))^r = a^r. Therefore, log(a^r) = rA = r log(a).
□
So our log function behaves like a logarithm should. 
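
These three properties are easy to spot-check numerically. In Python, math.log with one argument is the natural logarithm, matching the convention used here; the sample values of a, b, and r below are arbitrary choices for illustration.

```python
import math

a, b, r = 3.0, 7.0, 2.5   # arbitrary sample values with a, b > 0

# (a) log(1) = 0
assert math.log(1) == 0.0
# (b) log(ab) = log(a) + log(b)
assert math.isclose(math.log(a * b), math.log(a) + math.log(b))
# (c) log(a^r) = r log(a)
assert math.isclose(math.log(a ** r), r * math.log(a))
# log and exp undo each other
assert math.isclose(math.exp(math.log(a)), a)
assert math.isclose(math.log(math.exp(r)), r)
```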


EXERCISES 

1.7.1 For practice, simplify the following. (The goal is for these computations to become nearly automatic.) 
(a) log(e°) (b) log( ve) (c) eles3 (a) (e~tos4))2 

1.7.2 


(a) Write log,(7) as an expression in terms of the natural log function. 
(b) Assume a and b are positive and a ≠ 1. How do we write log_a b in terms of natural log?


1.7.3. Explain why if f is a strictly increasing function, then f must have an inverse function. 


REVIEW PROBLEMS 
1.31 Show that if A and B are two sets, then we cannot have both A ⊂ B and B ⊂ A.
1.32 The symmetric difference of two sets A and B is
A ⊖ B = {x | x is in exactly one of A or B}.
(a) Show that symmetric difference is commutative: A ⊖ B = B ⊖ A
(b) Show that symmetric difference is associative: (A ⊖ B) ⊖ C = A ⊖ (B ⊖ C)
(c) What is A ⊖ ∅?
(d) Describe, in words, the set A ⊖ B ⊖ C ⊖ D, and why it is valid to write this without parentheses.


1.33 Suppose a < b < c < d are real numbers. Write each of the following as an interval if possible. If it is not an
interval, explain why not.

(a) (a,b) ∪ (c,d) (b) (a,b) ∩ (c,d)

(c) (a,c) ∪ (b,d) (d) (a,c) ∩ (b,d)

1.34 Find the domain and range of f(x) = √(2 - x - x²).

1.35 Suppose that a function f has domain (-2,2) and range (-3,5). Find the domain and range (if possible) of:
(a) f(2x) (b) f(√x) (c) f⁻¹(x) (assuming it exists)

1.36 Evaluate the following. (The goal is for these computations to become nearly automatic.) 

(a) tan% (b) cos # (c) sin % (d) csc # 
(e) cot 22 (f)  sin(msin(#)) (g) tan21n (h)  sec(-2)
1.37 Find the following quantities in terms of sin θ or cos θ:


(a) cos(n-0)  (b) sin($-0) (© sin(%# +0) 


1.38 Compute sin 3%, cos 9% and tan 3%. Hints: 39 


1.39 Find the smallest positive θ such that sin 3θ = cos 7θ. Hints: 87
1.40 Find arctan (tan in), 


1.41 If x ∈ [-1,1], write cos(sin⁻¹ x) more simply in terms of x. Hints: 267, 89
1.42 Solve the following equations: 

(a) log,(25**) = 10 

(b) log, (x?) + log, (3x) = 16 

(c) e*—3e*-4=0 


CHALLENGE PROBLEMS 

1.43 

(a) If A ∪ B = A, what is the relationship (if any) between A and B?
(b) If A ∩ B = A, what is the relationship (if any) between A and B?
Hints: 132


1.44 Let f(x) = (ax + b)/(cx + d) for some real numbers a, b, c, d with ad ≠ bc.


(a) Find the domain and range of f.
(b) Find the inverse of f if it exists.
(c) Why is the condition that ad ≠ bc important? What happens if ad = bc?


1.45 Determine all 6 such that 0 < 6 < F and sin? @ + cos? @ = 1. Hints: 280, 174 
1.46 Compute sin(arctan 3). Hints: 232 


2 
1.47 Ifsin2x = 5s and cos x > sin x, then compute cos x — sin x. Hints: 60 


1.48 Compute tan! } + tan“! 4. Hints: 118, 237 


1.49 The following are the hyperbolic trig functions:

sinh x = (e^x - e^(-x))/2,
cosh x = (e^x + e^(-x))/2.

(a) Prove that cosh² x - sinh² x = 1 for all x ∈ ℝ.

(b) Find a formula for sinh(x + y) in terms of sinh x, sinh y, cosh x, and cosh y. Do the same for cosh(x + y).

(c) Can you suggest why we use the name hyperbolic for these functions? Hints: 216 

1.50★ There’s an interesting phenomenon called Russell’s paradox that occurs with sets: Let C be the set whose
elements are sets that do not contain themselves as an element. Is C ∈ C? Can you explain what’s going on?
Hints: 6

1.A RELATIONSHIP BETWEEN TRIGONOMETRIC FUNCTIONS AND EXPONENTIALS


Two of the types of functions that we looked at in this chapter (trig functions and exponentials) are connected
using complex numbers. Although we will not be using complex numbers in our study of calculus in this book,
we will briefly discuss how complex numbers connect trig functions to exponentials.


The set of complex numbers, denoted ℂ, is the set
ℂ = {a + bi | a, b ∈ ℝ}.


The imaginary number i is defined to have the property that i² = -1. In the complex number a + bi, the real
number a is called the real part and the real number b is called the imaginary part.


Complex numbers are added and multiplied as follows:
(a + bi) + (c + di) = (a + c) + (bi + di)
= (a + c) + (b + d)i,
(a + bi)(c + di) = ac + bci + adi + bdi²
= (ac - bd) + (ad + bc)i.


It turns out that trig functions are related to exponentials of complex numbers via one of the most important 
equations of analysis: 


Important: Euler’s Formula:

e^(iθ) = cos θ + i sin θ.


For example, letting θ = π in Euler’s Formula gives
e^(iπ) = cos π + i sin π = -1 + i(0) = -1.


This is often rewritten as
e^(iπ) + 1 = 0,


and is considered by many to be the most elegant equation in all of mathematics, as it combines what are arguably
the five most fundamental constants in math: 0, 1, π, e, and i.


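A less trivial example is

e^{iπ/3} = cos(π/3) + i sin(π/3) = 1/2 + (√3/2)i.


Notice that the left side cubed is
(e^{iπ/3})³ = e^{iπ} = −1,


and we see that this is true when we cube the right side as well:

(1/2 + (√3/2)i)³ = (1/2)³ + 3(1/2)²(√3/2)i + 3(1/2)(√3/2)²i² + (√3/2)³i³
                 = 1/8 + (3√3/8)i − 9/8 − (3√3/8)i
                 = −1.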

One of the many wonderful things about Euler’s Formula is that it encapsulates the sine and cosine
angle-addition formulas. If α, β ∈ ℝ, then we have

e^{i(α+β)} = cos(α + β) + i sin(α + β),
but on the other hand

e^{i(α+β)} = (e^{iα})(e^{iβ})
           = (cos α + i sin α)(cos β + i sin β)
           = (cos α cos β − sin α sin β) + i(sin α cos β + cos α sin β).


Comparing the real and imaginary parts of this with the Euler’s Formula expansion of e^{i(α+β)} from above gives us
the angle-addition formulas:

cos(α + β) = cos α cos β − sin α sin β,
sin(α + β) = sin α cos β + cos α sin β.


So if you remember Euler’s Formula, you don’t have to memorize the angle-addition formulas, since they are 
very easily reconstituted via the above calculation. 
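
If you have a computer handy, these identities can be sanity-checked numerically. The following is only a minimal Python sketch, not part of the development in this book: the helper names euler_rhs and angle_addition_via_euler and the sample angles are our own choices, and floating-point arithmetic gives agreement only up to rounding error.

```python
import cmath
import math


def euler_rhs(theta):
    """Right side of Euler's Formula: cos(theta) + i sin(theta)."""
    return complex(math.cos(theta), math.sin(theta))


def angle_addition_via_euler(alpha, beta):
    """Multiply e^{i alpha} by e^{i beta} and read off the real and
    imaginary parts, which should be cos(alpha+beta) and sin(alpha+beta)."""
    product = euler_rhs(alpha) * euler_rhs(beta)
    return product.real, product.imag


alpha, beta = 0.7, 1.9  # arbitrary sample angles in radians (an assumption)

print(cmath.exp(1j * math.pi))               # close to -1, up to rounding
print(euler_rhs(math.pi / 3) ** 3)           # also close to -1
print(angle_addition_via_euler(alpha, beta))
print(math.cos(alpha + beta), math.sin(alpha + beta))
```

The last two printed lines should agree to many decimal places, which is the angle-addition calculation above carried out with specific numbers.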


Right now, this is magic. In Chapter 7, we'll see more clearly why Euler’s Formula is true. 


LIMITS AND CONTINUITY


2.1 Limits 


Limit is probably the calculus concept that students initially find the most confusing. Intuitively, a limit
describes what happens as a function “approaches” a point in ℝ. We can get some idea of what a limit is by
looking at an example.


Problem 2.1: Consider the functions f(x) = x + 1 and g(x) = (x² − 1)/(x − 1).
(a) Determine Dom(f) and Dom(g).
(b) Sketch the graphs of f and g. How do they differ?


Solution for Problem 2.1:


(a) It is clear that Dom(f) = ℝ. However, 1 ∉ Dom(g), since that would make the denominator equal to 0;
otherwise, g is defined. So Dom(g) = ℝ \ {1} = (−∞, 1) ∪ (1, +∞).


(b) The graph of f is the line y = x + 1, as in the figure on the left. The graph of g is also this line for all
x ≠ 1, because if x ≠ 1 then (x² − 1)/(x − 1) = (x − 1)(x + 1)/(x − 1) = x + 1. However, 1 is not in the
domain of g. Thus, there is a “hole” in the graph of g at (1,2), as in the figure on the right.

[Figure: graphs of y = f(x) (left) and y = g(x) (right).]

□


On the other hand, looking at the graph of g from Problem 2.1, it is “obvious” that the graph of g “approaches”
the point (1,2). Slightly more precisely (though still pretty informally), the graph of g gets “arbitrarily close” to the
point (1,2). Mathematically, we write this as lim_{x→1} g(x) = 2. This should be read as “the limit of g(x) as x
approaches 1 is 2.” We could also write it without referring to the name of the function, as
lim_{x→1} (x² − 1)/(x − 1) = 2.
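
One informal way to see this limit is to tabulate g at points closer and closer to 1. The short Python sketch below does this; it is an illustration only (the step sizes are arbitrary choices of ours), and a table of values suggests a limit but never proves one.

```python
def g(x):
    """g(x) = (x^2 - 1)/(x - 1); not defined at x = 1."""
    return (x * x - 1) / (x - 1)


# Approach 1 from both sides with shrinking step sizes h = 10^-1, ..., 10^-5.
for k in range(1, 6):
    h = 10.0 ** (-k)
    print(f"h = {h:g}:  g(1 - h) = {g(1 - h):.6f}   g(1 + h) = {g(1 + h):.6f}")
```

Both columns should settle toward 2, while evaluating g(1) itself fails with a division by zero, matching the fact that 1 is not in the domain of g.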


There are lots of different ways that we could try to define this precisely, and mathematicians over the years
tried some different ideas. The idea that stuck is colloquially called the δ-ε definition of limit. (The Greek letters
δ (delta) and ε (epsilon) are used throughout calculus in this context, so get used to them.)


The definition is a bit confusing the first time you see it: 


Definition: Let f be a real-valued function. We say that the limit of f(x) as x approaches a is L, or
lim_{x→a} f(x) = L,


if, for all ε > 0, there exists δ > 0 such that if x is within δ of a (with x ≠ a), then f(x) is within ε of L. We write
this more precisely as
0 < |x − a| < δ ⇒ |f(x) − L| < ε,


where the “⇒” symbol means “implies”:


If 0 < |x − a| < δ, then |f(x) − L| < ε.


This definition is much easier to understand by looking at the picture to the right. Intuitively, we think of
lim_{x→a} f(x) = L as meaning that the graph of f comes arbitrarily close to the point (a, L). What this means is:
suppose someone tells you that you need to have f(x) within ε of L. In other words, someone gives you a value
of ε and insists that |f(x) − L| < ε. Then, you have to be able to come up with a value of δ such that all values
of x within δ of a (except possibly a itself) have f(x) within ε of L. This means you must choose δ such that
0 < |x − a| < δ ⇒ |f(x) − L| < ε.
To put it another way, the entire graph of f on the (horizontal) interval (a − δ, a + δ), except possibly at a
itself, must lie within the (vertical) interval (L − ε, L + ε).

[Figure: graph of f near (a, L), with the horizontal band between L − ε and L + ε marked.]


In other words, you are given the width of the vertical region centered at L that we require the graph of the
function to be contained in, and you have to determine a width of a horizontal region centered at a so that the
graph of the function is contained entirely within the box near (a, L). For example, the diagram on the right below
shows a smaller value of ε than the diagram above; the value of δ is also decreased to “trap” the function inside
the box near (a, L).


There are a couple of things slightly tricky about this definition. First, it has to work for any positive value
of ε, but once ε is given, we get to choose the value of δ that works. (Some sources even use the functional
notation δ(ε), to emphasize the fact that the value of δ depends on the value of ε.) Second, the value f(a)
doesn’t enter into this at all; in fact, a need not even be in the domain of f. Limit is a concept that describes
what happens as x approaches a, not what happens at a.

[Figure: the same graph with a smaller ε; the band from L − ε to L + ε and the strip from a − δ to a + δ are both narrower.]

The definition of limit is a tricky concept. Don’t be discouraged if it doesn’t make sense right away. Hopefully,
as we progress through the next several examples, the concept will become more clear.


Let’s practice with this definition by going back to our original example: 

Problem 2.2: Consider the function g(x) = (x² − 1)/(x − 1). We wish to prove that lim_{x→1} g(x) = 2.

(a) If ε = 0.2 is given, what value of δ can we choose to satisfy the definition?
(b) If ε = 0.05 is given, what value of δ can we choose to satisfy the definition?
(c) Prove that lim_{x→1} g(x) = 2.


Solution for Problem 2.2: We first note that g(x) = x + 1 for all values x ∈ ℝ \ {1}. Note that the definition of
our limit does not concern itself with what happens when x = 1, because we only examine values of x such that
0 < |x − 1| < δ for some δ > 0, and in particular we don’t have to worry about x = 1. Thus, we can replace g(x)
with x + 1 when trying to prove the condition in the definition of limit, since g(x) = x + 1 for all values of x that
we will be considering.


Concept: The definition of lim_{x→a} f(x) does not concern itself with what happens at x = a, only
with what happens near x = a.


With this idea, let’s proceed to the specific calculations: 


(a) We need to choose δ such that
0 < |x − 1| < δ ⇒ |g(x) − 2| < 0.2.


But since x ≠ 1, the condition on the right side of the implication above is equivalent to
|(x + 1) − 2| < 0.2,


or |x − 1| < 0.2. Thus, letting δ = 0.2 will satisfy the condition. In other words, if x ≠ 1 is within 0.2 (that is, δ)
of 1, then g(x) will be within 0.2 (that is, ε) of 2.


(b) We need to choose δ such that
0 < |x − 1| < δ ⇒ |g(x) − 2| < 0.05.


But since x ≠ 1, the condition on the right side of the implication above is equivalent to
|(x + 1) − 2| < 0.05,


or |x − 1| < 0.05. Thus, letting δ = 0.05 will satisfy the condition.


(c) Our solutions to parts (a) and (b) were exactly the same argument with just the value of ε changed, so this
argument should work for any value of ε. Specifically, we need to choose δ such that

0 < |x − 1| < δ ⇒ |g(x) − 2| < ε.
But since x ≠ 1, the condition on the right side of the implication above is equivalent to
|(x + 1) − 2| < ε,


or |x − 1| < ε. Thus, letting δ = ε will satisfy the condition.
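
The choices of δ in this solution can be spot-checked by sampling points x with 0 < |x − 1| < δ. The Python sketch below is an illustration under our own assumptions: the helper name delta_seems_to_work and the sample count are hypothetical. Passing the check is only evidence, since a proof must cover every x, but a failure is a genuine counterexample to a proposed δ.

```python
def delta_seems_to_work(f, a, L, eps, delta, samples=10_000):
    """Sample x with 0 < |x - a| < delta on both sides of a and test
    whether |f(x) - L| < eps at every sampled point."""
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (a - offset, a + offset):
            if not abs(f(x) - L) < eps:
                return False
    return True


def g(x):
    """The function from Problem 2.2."""
    return (x * x - 1) / (x - 1)


print(delta_seems_to_work(g, a=1, L=2, eps=0.2, delta=0.2))    # delta = eps
print(delta_seems_to_work(g, a=1, L=2, eps=0.05, delta=0.05))  # delta = eps
print(delta_seems_to_work(g, a=1, L=2, eps=0.2, delta=0.5))    # delta too large
```

The first two calls use δ = ε and should pass; the third uses a δ that is too generous for ε = 0.2 and should fail.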


It just so happened in Problem 2.2 that δ = ε worked. This will not usually be the case, as in the next example:

Solution for Problem 2.3: Let f(x) = 3x². Since the graph of f is “smooth” and has no holes or skips, we expect that
f(x) approaches f(2) as x approaches 2; in other words, we strongly suspect that

lim_{x→2} 3x² = f(2) = 12.
(Later in this chapter, we'll make this intuition more precise.) Let’s prove this limit using the δ-ε definition.
We are given some ε > 0, and we need to find δ such that
0 < |x − 2| < δ ⇒ |3x² − 12| < ε.


The inequality |3x² − 12| < ε will be more useful if it is in terms of x − 2 rather than x, since the inequality
0 < |x − 2| < δ is in terms of x − 2. For simplicity, let z = x − 2. Then we wish to find δ such that

0 < |z| < δ ⇒ |3(z + 2)² − 12| < ε.


We can simplify this to
0 < |z| < δ ⇒ |3z² + 12z| < ε.


However, we know that |3z² + 12z| ≤ |3z²| + |12z| = 3z² + 12|z|. So it suffices to find δ such that
0 < |z| < δ ⇒ 3z² + 12|z| < ε.
If 0 < |z| < δ, then 3z² + 12|z| < 3δ² + 12δ = 3δ(4 + δ). Thus it suffices to choose δ such that
3δ(4 + δ) ≤ ε.


The 4 + δ term is somewhat annoying. We can make it much simpler by assuming that δ ≤ 1.


Concept: All we have to do is find any δ that works for our given ε. So we can always make
δ smaller than it needs to be.


If we assume that δ ≤ 1, then 4 + δ ≤ 5, and the inequality that we need becomes simply

3δ(4 + δ) ≤ 15δ ≤ ε.


To force this to be true, we select δ = ε/15. (In the unlikely event that ε > 15, we can just take δ = 1.) We then
conclude that
0 < |z| < δ ⇒ |3z² + 12z| < 3δ(4 + δ) ≤ 15δ = ε.


Thus, for any ε ≤ 15, we have found that δ = ε/15 satisfies the δ-ε condition:
0 < |x − 2| < δ ⇒ |3x² − 12| < ε,
and hence we have established that lim_{x→2} 3x² = 12. □
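
To see the choice δ = ε/15 (capped at 1) in action, we can sample points within δ of 2 and record the largest value of |3x² − 12| that appears. This Python sketch is an illustration only; the helper names delta_for and worst_error, the sample count, and the test values of ε are our own assumptions.

```python
def delta_for(eps):
    """The choice made in the solution: eps/15, but never more than 1."""
    return min(1.0, eps / 15)


def worst_error(eps, samples=10_000):
    """Largest sampled value of |3x^2 - 12| over 0 < |x - 2| < delta_for(eps)."""
    delta = delta_for(eps)
    worst = 0.0
    for k in range(1, samples + 1):
        z = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (2 - z, 2 + z):
            worst = max(worst, abs(3 * x * x - 12))
    return worst


for eps in (30.0, 1.0, 0.01):
    print(f"eps = {eps}: delta = {delta_for(eps):.6f}, worst error = {worst_error(eps):.6f}")
```

In each row the worst sampled error should come out below ε. It is usually well below ε, because replacing 4 + δ by 5 gives up some room in exchange for a simpler formula.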


While performing the sort of calculation that we did in Problem 2.3 is useful to do once or twice in your life, 
in practice it’s not how we usually think about computing limits. We’ll have other, more useful techniques to 
compute limits that we'll develop later in this chapter and at other points in the book. 


Once again, we stress that the limit of f(x) as x approaches a does not depend at all on f(a). Let’s look at an 
extreme example of this: 


Problem 2.4: Consider the function

f(x) = { x if x ≠ 0,
       { 2 if x = 0.

What is the value of lim_{x→0} f(x)?

Solution for Problem 2.4: Looking at the graph of this function, we see that this is just the graph of y = x, except
the point (0,0) on the graph is replaced by the point (0,2). But as x approaches 0, it is clear that the graph of the
function approaches the point (0,0). Therefore, we should have lim_{x→0} f(x) = 0, since the function gets
“arbitrarily close” to (0,0) as x approaches 0.


Another way to look at this is to compare f to the function g(x) = x. These functions are equal at all of ℝ except
for 0. Thus, since the limit doesn’t depend on what happens at x = 0, but only depends on what happens near
x = 0, we have

lim_{x→0} f(x) = lim_{x→0} g(x) = 0.


Yet another way to see this is to appeal to the δ-ε definition. For any ε > 0, we can choose 0 < δ ≤ ε, since

0 < |x − 0| < δ ⇒ |f(x) − 0| = |x − 0| < δ ≤ ε.


Once again, the important concept to take from Problem 2.4 is


Important: When determining lim_{x→a} f(x), the value of f(a) is irrelevant.


In fact, f(a) doesn’t even have to be defined, as in Problem 2.2, where we computed lim_{x→1} (x² − 1)/(x − 1) = 2,
even though 1 is not in the domain of this function.


It may not immediately be clear that our definition gives us a unique limit. In other words, we need to check 
the following: 


Problem 2.5: Suppose lim_{x→a} f(x) = L and lim_{x→a} f(x) = L′. Prove that L = L′.


Solution for Problem 2.5: Intuitively, if we had two different limits L and L′, there’s
no way that the function could “approach” both points (a, L) and (a, L′) simultaneously.


The picture to the right gives us the idea of the proof. We want to pick ε small
enough so that the shown horizontal bands for the two limits don’t intersect. We
can pick any ε such that 0 < ε < |L − L′|/2, which is possible because by assumption
L ≠ L′. The picture clearly shows that, no matter how close x is to a, we cannot
have f(x) within ε of both L and L′, because the horizontal bands in the picture
don’t overlap. A picture, however, is not a proof, so let us prove this rigorously
using the δ-ε definition of limit.


A picture may give you intuition, but it is never a substitute for a rigorous proof. 


We’ll now prove the result rigorously using the definition of limit. We suppose that L ≠ L′ and try to derive a
contradiction. As above, let ε be such that 0 < ε < |L − L′|/2. Since L is a limit, we can find δ such that

0 < |x − a| < δ ⇒ |f(x) − L| < ε.

But since L′ is also a limit, we can find δ′ such that
0 < |x − a| < δ′ ⇒ |f(x) − L′| < ε.


This means if we pick x close enough to a, then both |f(x) − L| < ε and |f(x) − L′| < ε will be simultaneously
satisfied; that is,
0 < |x − a| < min{δ, δ′} ⇒ |f(x) − L| < ε and |f(x) − L′| < ε.


We write the first inequality on the right as |L − f(x)| < ε, and add the two inequalities on the right. We get
|L − f(x)| + |f(x) − L′| < 2ε.
Then, applying the Triangle Inequality (which states that |y + z| ≤ |y| + |z| for all y, z ∈ ℝ) gives us
|L − L′| = |(L − f(x)) + (f(x) − L′)| ≤ |L − f(x)| + |f(x) − L′| < 2ε,

so |L − L′|/2 < ε. But this contradicts our original choice of ε < |L − L′|/2, so we have our contradiction, and
thus the limit must be unique. □


When they are defined, limits have very nice properties. We’ll prove one, and then list a bunch of others that 
are similar (some of which you will prove as exercises). 


Solution for Problem 2.6: We would be shocked if the answer were not L + M. Let’s prove that it is L + M. As usual
when trying to rigorously prove facts about limits, we appeal to the δ-ε definition.


We want to show that if given any ε > 0, we can find δ > 0 such that
0 < |x − a| < δ ⇒ |(f + g)(x) − (L + M)| < ε.


This suggests constructing the similar inequalities for f and g separately using ε/2, and adding them together, as
we shall see.


Specifically, because we have lim_{x→a} f(x) = L, we can choose δ_f such that
0 < |x − a| < δ_f ⇒ |f(x) − L| < ε/2.


Similarly, because we have lim_{x→a} g(x) = M, we can choose δ_g such that
0 < |x − a| < δ_g ⇒ |g(x) − M| < ε/2.

So if we let δ = min{δ_f, δ_g}, then when 0 < |x − a| < δ, both |f(x) − L| < ε/2 and |g(x) − M| < ε/2 above are
satisfied, so we have
0 < |x − a| < δ ⇒ |f(x) − L| < ε/2 and |g(x) − M| < ε/2.
When we add the two inequalities on the right side and apply the Triangle Inequality, we get
|(f + g)(x) − (L + M)| ≤ |f(x) − L| + |g(x) − M| < ε/2 + ε/2 = ε.
Thus we have satisfied the δ-ε definition for lim_{x→a} (f + g)(x) = L + M. □


There are other similar properties; we will leave the proofs as exercises. 

Important: Let f and g be functions with lim_{x→a} f(x) = L and lim_{x→a} g(x) = M. Then:
1. lim_{x→a} (f + g)(x) = L + M
2. lim_{x→a} (fg)(x) = LM
3. If c ∈ ℝ, then lim_{x→a} (cf)(x) = cL
4. If M ≠ 0, then lim_{x→a} (f/g)(x) = L/M


Now we see that we can easily determine the limit of many “nice” functions by breaking them up into their
constituent parts. For example, revisiting Problem 2.3, we quickly see that:

lim_{x→2} 3x² = 3 (lim_{x→2} x²)
              = 3 (lim_{x→2} x)²
              = 3(2)² = 12.


Problem 2.7:
(a) Suppose that f is a real-valued function such that f(x) ≥ 0 for all x ∈ Dom(f), and suppose that
lim_{x→a} f(x) = L for some a ∈ ℝ. Show that L ≥ 0.
(b) In part (a), now suppose that f(x) > 0. Can we conclude that L > 0?
(c) Suppose that f and g are real-valued functions such that f(x) ≤ g(x) for all x ∈ ℝ. Show that
lim_{x→a} f(x) ≤ lim_{x→a} g(x),
provided both limits are defined.


Solution for Problem 2.7: 


(a) To prove this, we proceed by contradiction and assume that L < 0. Pick ε > 0 sufficiently small so that
L + ε < 0 (for example, we could take ε = −L/2). Then, by the definition of limit, there must exist δ such that

0 < |x − a| < δ ⇒ |f(x) − L| < ε.


But |f(x) − L| < ε means f(x) ∈ (L − ε, L + ε), which, by our choice of ε, means f(x) < 0, a contradiction. So we
must have L ≥ 0.


(b) A function that is strictly positive can still approach a value of 0; that is, we might have f(x) > 0 for all x ∈ ℝ
and yet have some value of a such that lim_{x→a} f(x) = 0. A simple example is f(x) = |x| for all x ∈ ℝ \ {0}. Note
that we are explicitly excluding 0 from the domain of f, so that f(x) > 0 for all x ∈ Dom(f). But clearly
lim_{x→0} |x| = 0,
so we only have L ≥ 0, not L > 0. We could also construct an example with Dom(f) = ℝ; for example,

f(x) = { |x| if x ≠ 0,
       { 1 if x = 0.

Note that f(x) > 0 for all x ∈ ℝ, but lim_{x→0} f(x) = 0.
(c) Consider the function g − f, so that (g − f)(x) ≥ 0 for all x ∈ ℝ, and use part (a). That is,
0 ≤ lim_{x→a} (g − f)(x) = lim_{x→a} g(x) − lim_{x→a} f(x),
so we conclude that lim_{x→a} f(x) ≤ lim_{x→a} g(x).


Along the lines of Problem 2.7 is an important result called the Squeeze Theorem: 


The Squeeze Theorem: Let a ∈ ℝ and let f, g, h be real-valued functions such
that f(x) ≤ g(x) ≤ h(x) for all x in an open interval containing a (but not
necessarily at a itself). If

lim_{x→a} f(x) = lim_{x→a} h(x) = L,

then lim_{x→a} g(x) = L as well.


We can see what's going on in the picture to the right. The function g is trapped
between the function f below and the function h above. Since f and h both approach the
point (a, L), and the function g is “squeezed” between them, we conclude that the function
g must approach (a, L) as well. Note that the Squeeze Theorem tells us both that the limit
must exist and that it must equal L. The proof of the Squeeze Theorem uses the result
from Problem 2.7; we will leave the details of the proof as an exercise.

[Figure: graphs of f, g, and h near the point (a, L), with g between f and h.]


We'll continue our initial exploration of limits with a classic nontrivial example that is
quite important for calculus:


Problem 2.8: Compute lim_{θ→0} (sin θ)/θ.


Solution for Problem 2.8: Of course, the function is undefined at θ = 0. You can make a guess of the limit by
computing (via calculator) (sin θ)/θ for values of θ close to 0. You can also make a guess by using your graphing
calculator to graph y = (sin x)/x and observing where the graph appears to cross the y-axis. Doing either of these
should convince you that the limit should be 1.

We can see geometrically what’s going on by looking at the picture to the right. Shown
is a unit circle with a central angle θ drawn. By definition, the length of the bold arc is θ
and the length of the vertical line segment is sin θ. As θ decreases towards 0, the lengths
of these two curves grow closer together, as in the bottom picture. It is geometrically
“obvious” that the ratio (sin θ)/θ is approaching 1 as the angle θ approaches 0.

[Figure: unit circle with central angle θ, showing the arc of length θ and the vertical segment of length sin θ.]


To prove rigorously that lim_{θ→0} (sin θ)/θ = 1, we will use the Squeeze Theorem. Specifically,
we are looking for functions f(θ) and h(θ), both with limits of 1 as θ → 0, such that

f(θ) ≤ (sin θ)/θ ≤ h(θ).


To do this, we’ll extend our unit circle diagram a bit, and look at the areas of the shaded
regions in the pictures below:

[Figure: three unit-circle diagrams with central angle θ. Left: shaded triangle of height sin θ. Middle: shaded circular sector. Right: shaded triangle of height tan θ.]


The shaded region in the left picture is a triangle with base 1 and height sin θ, so its area is (sin θ)/2. The shaded
region in the middle picture is a sector of a circle of radius 1 with central angle θ, so its area is θ/2. The shaded
region in the right picture is a triangle with base 1 and height tan θ, so its area is (tan θ)/2. Thus, comparing the
areas, we have

(sin θ)/2 < θ/2 < (tan θ)/2,

or more simply,
sin θ < θ < tan θ.


Taking reciprocals (which we can do since we are only considering small θ > 0) reverses the direction of the
inequalities, so we have

1/(tan θ) < 1/θ < 1/(sin θ).

Then, multiplying by sin θ (which is positive), and noting that (sin θ)/(tan θ) = cos θ, gives us

cos θ < (sin θ)/θ < 1.
é 
Our calculation only establishes this inequality chain for 0 < θ < π/2, but since all the terms are the same when we
replace θ with −θ, it is true for all nonzero θ ∈ (−π/2, π/2). Furthermore, our above argument shows that
−|θ| < sin θ < |θ|

for nonzero values of θ close to 0, and hence
cos θ = √(1 − sin² θ) > √(1 − θ²) > 1 − θ²,
so we have
1 − θ² < cos θ < (sin θ)/θ < 1
for nonzero values of θ sufficiently close to 0. Thus, by the Squeeze Theorem, since
lim_{θ→0} (1 − θ²) = lim_{θ→0} 1 = 1, we conclude that lim_{θ→0} (sin θ)/θ = 1.
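
The chain of inequalities used in this squeeze argument can be checked numerically for a few small angles. The Python sketch below is an illustration only; the helper name squeeze_bounds and the sample angles are our own choices, and a handful of numerical checks does not replace the geometric argument.

```python
import math


def squeeze_bounds(theta):
    """Return (1 - theta^2, cos(theta), sin(theta)/theta, 1) for nonzero theta."""
    return (1 - theta * theta, math.cos(theta), math.sin(theta) / theta, 1.0)


# Positive and negative sample angles (in radians), shrinking toward 0.
for theta in (0.5, -0.5, 0.1, -0.01, 0.001):
    lower, cosine, ratio, upper = squeeze_bounds(theta)
    print(f"theta = {theta:>6}: {lower:.8f} < {cosine:.8f} < {ratio:.8f} < {upper}")
```

As θ shrinks, the outer quantities 1 − θ² and 1 close in on each other, leaving (sin θ)/θ no room to be anything but close to 1.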


Often, when considering lim_{x→a} f(x) for some function f, it is useful to look only as x approaches a from one
side; that is, we only look at the behavior for x < a or for x > a. We illustrate the basic idea with the following
example:


Problem 2.9: Consider the function

f(x) = { x if x ≤ 0,
       { x + 1 if x > 0.

(a) Show that lim_{x→0} f(x) does not exist.

(b) What would be a reasonable answer to the question “what is the limit of f(x) as x approaches 0 from the
left?” What if we replace “from the left” with “from the right”?


Solution for Problem 2.9: 


(a) The graph of y = f(x) has a “jump” at x = 0. It certainly does not appear that a limit
exists at x = 0. To prove this, suppose that lim_{x→0} f(x) = L, and we will attempt to show
a contradiction. If the limit exists, then for any ε > 0, there must exist some δ > 0 such
that

0 < |x| < δ ⇒ |f(x) − L| < ε.


Choose some x such that 0 < x < δ. Then, since f(x) = x + 1 and f(−x) = −x, we must
have

|x + 1 − L| < ε and |L + x| = |−x − L| < ε.
Adding these and applying the Triangle Inequality gives
|1 + 2x| < 2ε ⇒ 1 < 2ε.


But this must be true for any value of ε, and clearly it cannot be true for ε < 1/2. This is our contradiction, so
the limit cannot exist.


(b) When we look at values of x to the left of 0, and ignore the values of x to the right of 0, we see that the graph
of the function f(x) approaches the point (0,0). So it makes sense to say “the limit of f(x) as x approaches 0
from the left is 0.”


Similarly, when we look at values of x to the right of 0, and ignore the values of x to the left of 0, we see
that the graph of the function f(x) approaches the point (0,1). So it makes sense to say “the limit of f(x) as x
approaches 0 from the right is 1.”


We’ll denote the so-called one-sided limits from Problem 2.9 with the notations
lim_{x→0⁻} f(x) = 0 and lim_{x→0⁺} f(x) = 1.


The superscript “−” in 0⁻ means to consider only values to the left of 0 (that is, less than 0), and the superscript
“+” in 0⁺ means to consider only values to the right of 0 (that is, greater than 0).


To make a formal δ-ε definition of one-sided limits, we restrict the values of x under consideration to those on
one side or the other:

Definition:
(a) We say that f(x) has limit L as x approaches a from the left, denoted

lim_{x→a⁻} f(x) = L,

if, for every ε > 0, there exists δ > 0 such that

0 < a − x < δ ⇒ |f(x) − L| < ε.


(b) We say that f(x) has limit L as x approaches a from the right, denoted
lim_{x→a⁺} f(x) = L,

if, for every ε > 0, there exists δ > 0 such that

0 < x − a < δ ⇒ |f(x) − L| < ε.


Note that all we’ve changed are the values of x that we have to consider. When we approach from the left, we
only consider x such that 0 < a − x; that is, x < a. When we approach from the right, we only consider x such that
0 < x − a; that is, x > a.


Let’s verify that these are weaker conditions than the regular definition of limit. 


Problem 2.10: Let f be a real-valued function. Show that
lim_{x→a⁻} f(x) = lim_{x→a⁺} f(x) = L

if and only if lim_{x→a} f(x) = L. (In other words, if the two one-sided limits agree at a point a, then the limit at a
exists and is equal to the one-sided limits, and vice versa.)


Solution for Problem 2.10: This is primarily a matter of chasing the definitions. Suppose the two one-sided limits
each equal L. Then, by definition, for any ε > 0, we can find δ₋ such that

0 < a − x < δ₋ ⇒ |f(x) − L| < ε,


and we can find δ₊ such that
0 < x − a < δ₊ ⇒ |f(x) − L| < ε.


We can satisfy these simultaneously by letting δ = min{δ₋, δ₊}. Then the two statements combine into one:
0 < |x − a| < δ ⇒ |f(x) − L| < ε.
This is the definition of lim_{x→a} f(x) = L, so we have finished the proof in this direction.


We will leave the proof of the other direction (that the existence of the limit implies the existence of the two
one-sided limits) as an exercise. □
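
One-sided limits can be explored numerically by sampling a function on only one side of the point in question. The Python sketch below does this for the function from Problem 2.9; the helper name one_sided_samples and the step sizes are our own assumptions, and the printed values only suggest the limits rather than prove them.

```python
def f(x):
    """The function from Problem 2.9: x for x <= 0, and x + 1 for x > 0."""
    return x if x <= 0 else x + 1


def one_sided_samples(func, a, side, steps=6):
    """Values of func at a - 10^-k (side='left') or a + 10^-k (side='right')
    for k = 1, ..., steps."""
    sign = -1 if side == "left" else 1
    return [func(a + sign * 10.0 ** (-k)) for k in range(1, steps + 1)]


print(one_sided_samples(f, 0, "left"))   # values heading toward 0
print(one_sided_samples(f, 0, "right"))  # values heading toward 1
```

Because the two lists head toward different numbers, the two-sided limit at 0 cannot exist, in agreement with Problem 2.10.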


EXERCISES
2.1.1 Give an example of a function f in which 0 ∈ Dom(f) but where lim_{x→0} f(x) is undefined.


2.1.2 Show, using the δ-ε definition, that for any a ∈ ℝ, lim_{x→a} x = a. Hints: 288


2.1.3 Prove that if lim_{x→a} f(x) = L and c ∈ ℝ, then lim_{x→a} (cf)(x) = cL.

2.1.4 The greatest integer function, denoted f(x) = ⌊x⌋, is the function in which f(x) equals the greatest integer
less than or equal to x. Show that
lim_{x→2⁻} ⌊x⌋ = 1 and lim_{x→2⁺} ⌊x⌋ = 2.

2.1.5 Prove that if lim_{x→a} f(x) = L, then
lim_{x→a⁻} f(x) = lim_{x→a⁺} f(x) = L.


2.1.6★ Show that lim_{x→a} (fg)(x) = (lim_{x→a} f(x)) (lim_{x→a} g(x)), assuming all terms are defined. (Caution:
this is harder than it looks.) Hints: 298, 51


2.1.7★ Prove the Squeeze Theorem. Hints: 277, 62


2.2 CONTINUITY 


Intuitively, a function is continuous if its graph can be drawn without lifting your pen from the paper. What 
this means is that the graph has no holes or jumps. Naturally, in order to rigorously use the notion of continuity, 
we'll need a more precise definition than “no holes or jumps.” At first, it seems somewhat tricky to state exactly 
what “no holes or jumps” means. But it’s actually pretty easy: we use limits. Continuity means that if the graph 
of a function approaches a point, then the graph actually hits that point; it doesn’t skip it or jump away at the last 
minute. 


Definition: Let f be a real-valued function. We say that f is continuous at a point a ∈ Dom(f) if

lim_{x→a} f(x) = f(a).


If f is continuous at every point in its domain, we say that f is continuous everywhere (or simply f is
continuous).


However, we need to modify this definition slightly if a is on the boundary of Dom(f). In this case, instead
of the limit, we need only use a suitable one-sided limit. For instance, if Dom(f) = [a,b], then we say that f is
continuous at a if lim_{x→a⁺} f(x) = f(a): since the function isn’t defined to the left of a, we just use the one-sided
limit as x approaches a from the right. Similarly, we say that f is continuous at b if lim_{x→b⁻} f(x) = f(b), since
we only care what happens as x approaches b from the left.


We'll start with some simple examples to show that continuity is not quite as mysterious as it may seem. 


Problem 2.11:
(a) Prove that f(x) = c, where c ∈ ℝ is a constant, is continuous.

(b) Prove that f(x) = x is continuous.


Solution for Problem 2.11:


(a) Let a ∈ ℝ. We wish to show that lim_{x→a} f(x) = f(a) = c. So, using the definition of limit, we let ε > 0 be
given. We need to choose δ so that
0 < |x − a| < δ ⇒ |f(x) − f(a)| < ε.


But since f(x) = f(a) = c for all x, the inequality on the right is just 0 < ε, which is true for any value of δ. So
the constant function is continuous.

(b) Let a ∈ ℝ. We wish to show that lim_{x→a} f(x) = f(a) = a. Again, we let ε > 0 be given, and we need to choose
δ > 0 so that
0 < |x − a| < δ ⇒ |f(x) − f(a)| < ε.


But the inequality on the right is just |x − a| < ε, so choosing δ = ε does the trick. Thus lim_{x→a} f(x) = a and the
function is continuous.

□


One thing to note about continuity is that it is a local property. What this means is that, for any a ∈ ℝ, if two
functions f and g are equal on an open interval containing a, then either both f and g are continuous at a or both
are discontinuous at a. This is because they have the same value at a and the same limit at a (if these exist). If I is
an interval, we write f|_I = g|_I to mean that f and g agree on I; that is, f(x) = g(x) for all x ∈ I. (The notation f|_I is
read “f restricted to I.”)


This localness property saves us some time when dealing with some more complicated functions, as in the 
next problem. 


Solution for Problem 2.12: The graph of y = ⌊x⌋ is shown at right. Just by observing
the graph, we see that it “jumps” at every integer, so we suspect that the function is
continuous at all non-integer values and discontinuous at the integers. Let’s try to
prove this.


If a ∈ ℝ is not an integer, then the function f(x) = ⌊x⌋ equals the constant function
g(x) = ⌊a⌋ on a sufficiently small open interval centered at a. Since g is continuous at a
(because it is a constant function), so is f.


On the other hand, if a ∈ ℤ, then f(a) = a, but the limit of f(x) as x approaches a
does not exist, since

lim_{x→a⁻} f(x) = a − 1 but lim_{x→a⁺} f(x) = a.


Since the two one-sided limits are not equal, the limit cannot exist. □
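
The jump behavior of the greatest integer function is easy to observe with Python's standard math.floor. The sketch below is an illustration only: the helper name floor_one_sided and the offset h are our own assumptions, and sampling a single nearby point on each side is merely suggestive of the one-sided limits.

```python
import math


def floor_one_sided(a, h=1e-9):
    """Evaluate the floor function slightly to the left and right of a."""
    return math.floor(a - h), math.floor(a + h)


for a in (2, -3, 2.5):
    left, right = floor_one_sided(a)
    behavior = "jump" if left != right else "locally constant"
    print(f"a = {a}: left value {left}, right value {right} ({behavior})")
```

At the integers the two values differ by 1, while at a non-integer such as 2.5 they agree, matching the localness argument in the solution.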


Next, let’s state an important basic observation about continuous functions: 


Important: Suppose that f and g are continuous at a, and c ∈ ℝ. Then f + g, fg, and cf are
all continuous at a. Further, if g(a) ≠ 0, then f/g is continuous at a.


All of these facts follow from the equivalent facts for limits. For instance, if f and g are continuous at a, then

lim_{x→a} (f + g)(x) = (lim_{x→a} f(x)) + (lim_{x→a} g(x)) = f(a) + g(a) = (f + g)(a),


so f + g is continuous at a. In particular, this means that, in light of Problem 2.11, all polynomial functions are
continuous on all of ℝ, and all rational functions (quotients of polynomials) are continuous at all points in their
domains. It is also true, but harder to prove, that the trigonometric, exponential, and natural logarithm functions
are continuous. (We will prove that sine and cosine are continuous as an exercise.)


The most important feature of a continuous function is the property explored in the following problem: 

Problem 2.13: Suppose that f is continuous on all of ℝ and that f(0) = 0 and f(2) = 3.
(a) Must there be a point c ∈ (0,2) such that f(c) = 1? Explain briefly why or why not (don’t expect to be able
to prove this rigorously).
(b) Must there be a point c ∈ (0,2) such that f(c) = 4? Explain briefly why or why not.
(c) As generally as you can, describe f([0,2]).
(d) Why is the assumption that f is continuous necessary?


Solution for Problem 2.13: 


(a) At right we sketch the graph of a continuous function f that contains the points (0,0)
and (2,3). The line y = 1 is drawn as a dashed horizontal line. We can see, intuitively, that there’s
no way for the graph to go from (0,0) to (2,3) without crossing the line y = 1 at least
once. At a point (c, 1) where the graph crosses that line, we have f(c) = 1.


Of course, this is not a rigorous proof, but we’ll get to that in a moment.


(b) We can do the same sketch as in part (a), but with the line y = 4 dashed instead.
Now it’s clear that we can draw a graph from (0,0) to (2,3) without hitting y = 4.
Indeed, the simplest example is f(x) = 3x/2; in this case, the graph is a
straight line and does not intersect y = 4 at a point with x ∈ (0,2).


(c) The only thing that was special about the value of f(c) in part (a) was that it was
between 0 and 3. In other words, if 0 < y < 3, we know (via the same reasoning as
in part (a)) that there is some value of c such that f(c) = y. This, combined with the
values of f at the endpoints 0 and 2, gives us [0,3] ⊆ f([0,2]). Of course, it’s possible
that f([0,2]) might be strictly larger than [0,3]. In fact, the most general thing that we
can say is that f([0,2]) is a closed interval that contains [0,3], but the proof is rather technical and we will get
to it a bit later.


(d) If f is not continuous, as in the picture at right, then f can have all sorts of gaps and
jumps, and there’s no guarantee that any particular value will be in the range. To
take an extreme example, we might have

f(x) = { 0 if x ≤ 1,
       { 3 if x > 1.

Note that this function (shown at right) is discontinuous only at a single point, x = 1,
but that f([0,2]) only contains the two points 0 and 3; that is, f([0,2]) = {0,3}.

□

As we saw in our exploration of Problem 2.13, the key feature of a continuous function is that it doesn’t skip
any points. This is more formally stated as:


Important: The Intermediate Value Theorem: If f is a continuous, real-valued function
defined on an interval [a,b], with f(a) ≠ f(b), and y is a real number between
f(a) and f(b), then there exists some c ∈ (a,b) such that f(c) = y.


Intuitively, the Intermediate Value Theorem means that every real number between f(a) and f(b) (so-called 
“intermediate values”) must be the image of some real number between a and b. In other words, f can’t “skip” 
any values. As we saw in Problem 2.13(d), it is vital that f be continuous in order to apply the Intermediate Value 
Theorem.


We will defer the proof of the Intermediate Value Theorem to Section 2.A. It is possibly the most technically
difficult proof of all of the calculus topics that we will study in this book.
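
The theorem only asserts that a suitable c exists, but for a continuous function we can hunt for one by repeatedly halving the interval. The following Python sketch is an illustration only, not the proof from Section 2.A; the name `find_c`, the tolerance, and the sample function x^3 + x on [0, 2] are our own choices.

```python
from typing import Callable


def find_c(f: Callable[[float], float], a: float, b: float, y: float,
           tol: float = 1e-12) -> float:
    """Approximate some c in [a, b] with f(c) = y.

    Assumes f is continuous on [a, b] and y lies between f(a) and f(b).
    """
    lo, hi = a, b
    g_lo = f(lo) - y
    g_hi = f(hi) - y
    if g_lo == 0:
        return lo
    if g_hi == 0:
        return hi
    if (g_lo < 0) == (g_hi < 0):
        raise ValueError("y must lie between f(a) and f(b)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = f(mid) - y
        if g_mid == 0:
            return mid
        # Keep the half-interval on which f - y still changes sign.
        if (g_mid < 0) == (g_lo < 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return (lo + hi) / 2


def cubic(x: float) -> float:
    return x**3 + x


# cubic(0) = 0 and cubic(2) = 10, so the intermediate value 5 must be attained.
c = find_c(cubic, 0.0, 2.0, 5.0)
print(c, cubic(c))
```

The sign check inside the loop is where continuity is used: at every step the value y is still trapped between the function values at the two ends of the shrinking interval.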


There is another result that is closely related to the Intermediate Value Theorem, but to prove it, we first must 
establish the following theorem, which is interesting in its own right. This result states that continuous functions 
are bounded on closed intervals. 


Important: The Boundedness Theorem: Suppose f is continuous on a closed interval [a,b].
Then there exists some value M ∈ R such that f(c) ≤ M for all c ∈ [a,b]. (That
is, M is an upper bound for f on the interval [a,b].)


WARNING!! This result is not true for open intervals (a,b). For example, consider the
function f(x) = 1/x on the interval (0,1). This function is continuous on (0,1)
but gets arbitrarily large as x gets close to 0, and thus has no upper bound (and in
particular no maximum) on (0,1).
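
A short Python sketch makes the failure concrete, assuming the function in the warning is f(x) = 1/x. The helper name `point_exceeding` is ours: given any proposed bound of at least 1, it produces a point of (0,1) where the function is larger.

```python
def f(x: float) -> float:
    return 1 / x


def point_exceeding(bound: float) -> float:
    """Return an x in (0, 1) with f(x) > bound, for any proposed bound >= 1."""
    return 1 / (2 * bound)


for bound in (1.0, 1000.0, 1e9):
    x = point_exceeding(bound)
    # f(x) is about twice the proposed bound, so the bound fails.
    print(bound, x, f(x))
```

No matter which bound is proposed, the same recipe defeats it, which is exactly what it means for the function to be unbounded on the open interval.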


We can intuitively see why the Boundedness Theorem must be true: if we try to draw a graph in which f is 
continuous yet grows without bound in [a, b], starting at x = a, then we can’t “go off to infinity” and yet still reach 
x = b. Of course, this is hardly a convincing argument. Unfortunately, the proof is quite technical, and we will 
defer it to Section 2.A as well. 


Understanding (in a general sense) why the Intermediate Value Theorem and 
the Boundedness Theorem are natural consequences of continuity is vastly more 
important for calculus than learning the nuts and bolts of the proofs. So you 
should not worry if you skip Section 2.A or if you don’t really follow all the 
details in that section. This is a part of calculus where seeing “the big picture” is 
a lot more important than worrying about all the gory details. 


The Boundedness Theorem allows us to prove the following important fact about continuous functions on 
closed intervals: 


Problem 2.14: The Extreme Value Theorem: Suppose f is continuous on a closed interval [a,b]. Show that f
attains a maximum on [a,b]; that is, there exists a real number c ∈ [a,b] such that f(c) ≥ f(x) for any x ∈ [a,b].


Solution for Problem 2.14: By the Boundedness Theorem, we know that f([a,b]) has an upper bound, and thus, by
the completeness property of R, it has a least upper bound. Call this least upper bound M. We need to show that
M is the image of some point c ∈ [a,b]. We will prove this by contradiction, so assume that M is not in
f([a,b]). Then the function g(x) = M − f(x) is continuous and strictly positive on all of [a,b]. But more than that:
because M is the least upper bound, we know that g([a,b]) ∩ (0, ε) ≠ ∅ for all ε > 0, because otherwise M − ε would be
an upper bound for f([a,b]).


Since g(x) ≠ 0 on [a,b], the function h(x) = 1/g(x) is a continuous function on [a,b]. However, h is unbounded,
since for all ε > 0, there is some x such that g(x) < ε, so that h(x) > 1/ε, and hence 1/ε is not an upper bound for h.
But the unboundedness of h contradicts the Boundedness Theorem.


Therefore, M must be in f([a,b]), as desired. □


Naturally, the Boundedness Theorem and the Extreme Value Theorem work just as well for lower bounds and 
minimums—we can simply replace f with —f, since a lower bound for f is —1 times an upper bound of —f, and 
vice versa.


Sidenote: It’s possible to construct a function that’s not continuous anywhere on its domain.
For example, consider the function

\[
f(x) = \begin{cases} 0 & \text{if } x \in \mathbb{Q}, \\ 1 & \text{if } x \notin \mathbb{Q}. \end{cases}
\]

Since the value of the function changes “infinitely frequently” between 0 and 1,
we suspect that the function does not have any limits. Indeed, for a limit L at a point
a to exist, for any ε > 0 we must be able to find δ > 0 such that

\[
0 < |x - a| < \delta \implies |f(x) - L| < \epsilon.
\]

But no matter how small we choose δ, there will be both rational and irrational
values of x such that 0 < |x − a| < δ. Thus, we must simultaneously satisfy

\[
|0 - L| < \epsilon \quad \text{and} \quad |1 - L| < \epsilon.
\]

Adding these (and using the triangle inequality, 1 ≤ |0 − L| + |1 − L|) means 1 < 2ε, so if ε < 1/2, we cannot
possibly satisfy these inequalities simultaneously. Therefore, $\lim_{x \to a} f(x)$ does not exist at any a ∈ R,
and hence f is discontinuous at all points of R.


EXERCISES 

2.2.1 Show that f(x) = |x| is continuous. 

2.2.2 Construct a function with domain R that is continuous at all points in R except 0 and 2.

2.2.3 Prove that if f and g are continuous at a, then so is fg. What about f/g? 

2.2.4 Show that if f is continuous on [a,b], then f attains a minimum on [a,b].

2.2.5★ Suppose that f is continuous on [a,b] and f([a,b]) ⊆ Q. What can we conclude about f? Hints: 227


2.2.6★ We prove that sine and cosine are continuous as follows:


(a) Use the fact that $\lim_{x \to 0} \frac{\sin x}{x} = 1$ to show that sin x is continuous at x = 0. Hints: 90


(b) Use the fact that sin² x + cos² x = 1 and part (a) to show that cos x is continuous at x = 0. Hints: 26, 295


(c) Use the angle-addition formulas to show that sine and cosine are continuous, by showing that

\[
\lim_{h \to 0} \sin(x + h) = \sin(x) \quad \text{and} \quad \lim_{h \to 0} \cos(x + h) = \cos(x).
\]


2.2.7★


(a) Show that every polynomial of odd degree has at least one real root. Hints: 65 


(b) Is there an analogous statement for polynomials of even degree? 


Sidenote: 


The Fundamental Theorem of Algebra states that a polynomial of degree n has at 
most n real roots. This is not that hard to prove—try to prove it yourself! The more 
general version of the Fundamental Theorem of Algebra states that a polynomial 
of degree n has exactly n complex roots, counting multiplicity. However, this is 
quite difficult to prove.


REVIEW PROBLEMS
2.15 Use the d-e definition of limit to prove that lim r = 8. 
x=, 


2.16 Compute the following: 


x-1 2)?-4 1= 
© tin mE So tim 
(a) lim Vx? (ce) lim(@-Lx) (f) lim (x - Lx) 


2.17 Find an example of a function f(x) with domain R for which $\lim_{x \to 0^+} f(x)$ exists but $\lim_{x \to 0^-} f(x)$ does not exist.


2.18 Suppose that f has domain R and is continuous. Prove or disprove: if the range of f contains both positive 
and negative numbers, then it must contain 0. 


2.19 Find a function f with domain R that is continuous on all of R except −1, 0, and 1.
2.20 


(a) Let f be a function such that lim f(x) = L. Prove that lim f (cx) = L for any nonzero constant c. 
x— x 
(b) Use part (a) to compute lim ==, where a is a nonzero real number. 
Xo 


.. sinax 
(c) Use part (b) to compute lim ait where a and b are nonzero real numbers. 
x 


2.21 Show that f is continuous if and only if for all a ∈ Dom(f) and all ε > 0, there exists δ > 0 such that

\[
|x - a| < \delta \implies |f(x) - f(a)| < \epsilon.
\]


CHALLENGE PROBLEMS 
2.22 Suppose f is a function with domain R such that |f(x)| ≤ |x| for all x ∈ R. Prove that f is continuous at 0.


2.23 Suppose we define $f(x) = \frac{\sin x}{x}$, but where x is in degrees, not radians. Find $\lim_{x \to 0} f(x)$.


2.24 


(a) Assume that lim f(x) = L. Prove that lim kr i= ki. 
x x 
(b) Find an example where lim f(x?) exists but lim f(x) does not exist. 
x x 


2.25 Suppose that f is a continuous function. 

(a) Show that if I is an interval with I ⊆ Dom(f), then f(I) is an interval.

(b) Show that if I is a closed interval with I ⊆ Dom(f), then f(I) is a closed interval.
(c) Find examples of noncontinuous functions such that (a) and (b) are not true.


2.26 Let f be a continuous function with domain [0,1] and range [0,1]. Prove that f must have a fixed point: 
that is, there exists some a ∈ [0,1] such that f(a) = a. Is the result still true if the domain and range are (0,1)?
Hints: 278

2.27★ Let f and g be functions with domain R. Suppose

\[
\lim_{x \to a} f(x) = b \quad \text{and} \quad \lim_{x \to b} g(x) = c.
\]

Prove or disprove that we must have $\lim_{x \to a} (g \circ f)(x) = c$. (If true, explain why, with a rigorous proof if possible; if
false, give an example.) Hints: 23, 170
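2.28★ Consider the function

\[
f(x) = \begin{cases} 0 & \text{if } x \notin \mathbb{Q}, \\ \dfrac{1}{q} & \text{if } x = \dfrac{p}{q} \in \mathbb{Q}, \text{ with } p, q \in \mathbb{Z},\ q > 0, \text{ and } p, q \text{ relatively prime.} \end{cases}
\]

Show that f is continuous at every irrational point and discontinuous at every rational point in R. Hints: 163, 185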


2.A PROOFS OF SOME CONTINUITY RESULTS 


We prove two important but technical results about continuous functions defined on closed intervals. 


Important: The Intermediate Value Theorem: If f is a continuous, real-valued function
defined on an interval [a,b], with f(a) ≠ f(b), and y is a real number between
f(a) and f(b), then there exists some c ∈ (a,b) such that f(c) = y.


We prove the theorem in the case that f(a) < f(b). (The proof in the case with f(a) > f(b) is nearly identical.)
Let y be given with f(a) < y < f(b). We want to find some c ∈ (a,b) such that f(c) = y. Define

\[
U = \{ x \in [a,b] \mid f(x) \le y \}.
\]

In other words, U is the subset of [a,b] that f maps to values less than or equal to y. Since U is a nonempty
bounded subset of R (it’s nonempty since a ∈ U), it has a least upper bound. Let c = sup U. We will prove that
f(c) = y.


First, we show that c ≠ a. Choose ε such that f(a) < y − ε < y. (For example, we could take ε = (y − f(a))/2,
since f(a) < y.) Then, since f is continuous at a, we can choose δ > 0 such that

\[
0 < x - a < \delta \implies |f(x) - f(a)| < \epsilon.
\]

But then f(x) < f(a) + ε < y, so x ∈ U, and thus [a, a + δ) ⊆ U. This means that any upper bound, and in particular
the least upper bound, must be at least a + δ; specifically, c ≥ a + δ, and in particular c > a.


Next, we show that c ≠ b, using a similar argument. Choose ε such that y < y + ε < f(b). (For example, we
could take ε = (f(b) − y)/2, since y < f(b).) Then, since f is continuous at b, we can choose δ > 0 such that

\[
0 < b - x < \delta \implies |f(x) - f(b)| < \epsilon.
\]

But then f(x) > f(b) − ε > y, so x ∉ U, and thus (b − δ, b] ∩ U = ∅. But this means that b − δ is an upper bound of U,
and hence c < b.


Recall our goal is to show that f(c) = y, where c = sup U. Let z = f(c); we will show that z = y by showing that 
both z < y and z > y lead to contradictions. 


If f(c) = z < y, then pick any ε such that z < z + ε < y. Since f is continuous, we have $\lim_{x \to c} f(x) = z$, so there is
some δ such that

\[
0 < |x - c| < \delta \implies |f(x) - z| < \epsilon.
\]




Pick δ satisfying the above to be small enough so that c + δ < b. (We can do this since we’ve already proved that
c < b.) Let c′ be any real number such that c < c′ < c + δ. Then |f(c′) − z| < ε, so f(c′) < z + ε < y. This means that
c′ ∈ U, which contradicts the fact that c is an upper bound for U.


On the other hand, if f(c) = z > y, then pick any ε such that y < z − ε < z. Since f is continuous, we have
$\lim_{x \to c} f(x) = z$, so there is some δ such that

\[
0 < |x - c| < \delta \implies |f(x) - z| < \epsilon.
\]

Pick δ small enough so that a < c − δ, and let c′ be any real number such that c − δ < c′ < c. Then every x with
c′ ≤ x ≤ c satisfies |f(x) − z| < ε, giving f(x) > z − ε > y, so no such x is in U. But then c′ is an upper bound
for U as well, contradicting the fact that c was the least upper bound.


Thus we must have f(c) = y, proving the theorem. 


Important: The Boundedness Theorem: Suppose f is continuous on a closed interval [a,b].
Then there exists some value M ∈ R such that f(c) ≤ M for all c ∈ [a,b]. (That
is, M is an upper bound for f on the interval [a,b].)


Let’s try to isolate the point where f might first become unbounded. In particular, let’s start at the point x = a 
and move slowly rightward, and try to figure out exactly where the function might become unbounded. 


Clearly f is bounded on the interval [a,a] (which is just the point a), since f(a) is an upper bound. We move to 
the right, and keep track of where the function stays bounded. Specifically, let 


\[
D = \{ x \in [a,b] \mid f([a,x]) \text{ is bounded} \}.
\]

Clearly D is a bounded set (it’s a subset of [a,b]), and by our basic properties of R, we know that D must have a
least upper bound. Let d = sup D. Note that for any c such that a < c < d, the set f([a,c]) is bounded.


First, we prove that d > a. By assumption, f is continuous at a, thus for any ε > 0 we can choose δ > 0 such
that f((a, a + δ)) ⊆ (f(a) − ε, f(a) + ε). In particular, this implies that (a + δ) ∈ D, so d ≥ a + δ > a.


We next consider the possibility that d ≠ b. Since f is continuous at d, we have $\lim_{x \to d} f(x) = f(d)$. Thus, given
ε > 0, we can find δ > 0 such that

\[
0 \le |x - d| < \delta \implies |f(x) - f(d)| < \epsilon.
\]

Choose δ small enough so that (d − δ, d + δ) ⊆ (a,b). Then

\[
f([a, d + \delta)) = f([a, d - \delta]) \cup f((d - \delta, d + \delta)).
\]

The first term on the right is bounded since d − δ < d, and the second term is bounded since it lies entirely within
the interval (f(d) − ε, f(d) + ε). Thus we see that [a, d + δ) ⊆ D, but this contradicts the fact that d = sup D.


Thus, we have shown that d = b. Since f is continuous at b, we have $\lim_{x \to b^-} f(x) = f(b)$. Thus, given ε > 0 we can
find δ > 0 such that

\[
0 < b - x < \delta \implies |f(x) - f(b)| < \epsilon.
\]

Choose δ small enough so that a < b − δ. Then

\[
f([a,b]) = f([a, b - \delta]) \cup f((b - \delta, b]).
\]

The first set on the right is bounded since sup D = b, and the second set is bounded since it lies entirely within the
interval (f(b) − ε, f(b) + ε). Thus we see that f([a,b]) is bounded, as desired.


CHAPTER 3


THE DERIVATIVE


Calculus provides the answers to two natural geometric questions regarding the graph y = f(x) of a function f. 


Question 1: Given a real number a in the domain of a function f, what
is the slope of the line that is tangent to the graph y = f(x) at the point
(a, f(a))?


Question 2: Given real numbers a < b such that the interval [a,b] is a subset
of the domain of a function f, what is the area of the region bounded by
the graph y = f(x), the lines x = a and x = b, and the x-axis?


We’ll defer our discussion of Question 2 until Chapter 5, when we will have some more tools at our disposal.
Our immediate goal is to try to answer Question 1. The answer to Question 1 is the fundamental calculus concept
known as the derivative.


3.1 INTUITIVE INTRODUCTION 


Our first task is to decide what we mean by “tangent.” We can think back to plane
geometry: given a circle C and a point P on C, there is a unique line ℓ through P that does
not intersect C at another point. This is called the tangent line to C at P. Every other line
through P will also intersect the circle at some other point; such lines are called secant lines.


Let’s provisionally use this as our definition of a tangent line to the graph of a function:


Provisional Definition #1: A tangent line to the graph of y = f(x) at the point (a, f(a)), where a is in the
domain of f, is a line ℓ passing through (a, f(a)) that does not intersect the graph at any other point.


Let’s try this provisional definition with a simple function: 


Problem 3.1: Let f(x) = x². Find a “tangent line” (under our provisional definition above) to the graph y = f(x)
at the point (1,1).


Solution for Problem 3.1: Any line (other than a vertical line) through (1,1) will satisfy the equation (y − 1) = m(x − 1),
where m is the slope of the line. So y = mx + (1 − m). We want to find m such that the point (1,1) is the only
intersection of y = x² and y = mx + (1 − m). This means that (x, y) = (1,1) is the only solution to the system of
equations

\begin{align*}
y &= x^2, \\
y &= mx + (1 - m).
\end{align*}

Eliminating y gives us x² − mx + (m − 1) = 0, which is a quadratic equation in x. Clearly x = 1 is a solution to this
equation (as we already know), but it will have another solution if the discriminant is positive. So for x = 1 to be
the only solution, we must have m² − 4(m − 1) = 0. This simplifies to (m − 2)² = 0, so m = 2. (Note that for all
other values of m, the discriminant is positive, so there are two solutions.) Therefore, the slope of the tangent line
is m = 2, and the equation of the line is y = 2x − 1. □
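
The discriminant argument is easy to check by machine. This small Python sketch assumes the quadratic x^2 - mx + (m - 1) = 0 obtained by intersecting y = x^2 with a line of slope m through (1,1); scanning only the integer slopes from -10 to 10 is an arbitrary choice for illustration.

```python
def discriminant(m: int) -> int:
    """Discriminant of x^2 - m*x + (m - 1) = 0, the intersection of y = x^2
    with the line of slope m through (1, 1)."""
    return m * m - 4 * (m - 1)


# Slopes for which the line meets the parabola in exactly one point.
single_intersection = [m for m in range(-10, 11) if discriminant(m) == 0]
print(single_intersection)  # [2]
```

Every other slope in the scan gives a positive discriminant, and hence a second intersection point, in agreement with the factorization (m - 2)^2.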


We can sketch a picture of this, as shown to the right, to convince ourselves that
we did the right thing. It does indeed look like the correct tangent line. This all looks
fine, but it seems like a lot of trouble to go through just to find a tangent line. Another
issue is that there is another line that intersects y = x² only at the point (1,1), and that
is the line x = 1. We didn’t get that line in our solution above because its slope is
undefined. (Remember our “other than a vertical line” qualifier in the first sentence
of our solution to Problem 3.1.)


Moreover, we can see that in any graph y = f(x), the line x = a will intersect the
graph in only one point, namely (a, f(a)). This is the Vertical Line Test for graphs of
functions: there is only one y-value for each possible x-value in the domain.


We can revise our provisional definition to exclude vertical tangent lines:


Provisional Definition #2: A tangent line to the graph of y = f(x) at the point (a, f(a)), where a is in the
domain of f, is a line ℓ, not parallel to the y-axis, passing through (a, f(a)) that does not intersect the graph at
any other point.


But there’s a more significant flaw, as we'll see with a slightly more complicated example: 


Problem 3.2: Find the equation of the tangent line to the graph of y = x³ + x at the point (1,2).


Solution for Problem 3.2: A non-vertical line through (1,2) has the equation (y − 2) = m(x − 1). So we are looking
for a value of m that gives only the solution (x, y) = (1,2) to the system

\begin{align*}
y &= x^3 + x, \\
y &= mx + (2 - m).
\end{align*}

Setting the two equations equal gives x³ + (1 − m)x + (m − 2) = 0. We know that x = 1 is always a solution, so we
can factor the equation and get (x − 1)(x² + x − (m − 2)) = 0. We need the cubic to have no other roots, so either the
quadratic has the double root x = 1, or the quadratic has no roots (since we want x = 1 to be the only root of the
original cubic). The only monic quadratic with double root x = 1 is x² − 2x + 1, which clearly does not match our
quadratic, so the only possibility is for the quadratic to have no root.


Thus, we need the discriminant of the quadratic to be negative. The discriminant is 1 + 4(m − 2) = 4m − 7, which is
negative when m < 7/4. So any line with slope less than 7/4 through (1,2) intersects the graph of y = x³ + x only at
the point (1,2). □


This is a bad state of affairs. We have a whole range of “tangent” lines. Worse, none of
these lines really look like good candidates for a tangent line, as they all “cross” the graph,
as shown in the diagram to the right. If we think back to our original circle example, the
tangent line through a point on a circle just “touches” the circle without “crossing” it. We’d
like that property to be true of more general tangents to graphs.


We can see a little bit of what’s really going on here if we look at the value of m that
makes our quadratic x² + x − (m − 2) also have a root of x = 1. Plugging in x = 1 gives
2 − (m − 2) = 0, so m = 4. If we draw the line with slope 4 through the point (1,2), which
is the line y = 4x − 2, then we get the picture at left below. This looks like what we should
have as our “tangent line” to y = x³ + x at (1,2): it just “touches” the curve at (1,2), but doesn’t cross it. However,
this line has the unfortunate property that it intersects the curve at another point, namely (−2, −10), as shown in
the picture to the left below. Should we be worried about this?
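
Both claims about the line of slope 4 can be checked directly. The Python sketch below assumes the curve y = x^3 + x and the line y = 4x - 2 from the discussion above; the range of integers scanned and the two test points 0.9 and 1.1 are arbitrary choices.

```python
def curve(x: float) -> float:
    return x**3 + x


def line(x: float) -> float:
    return 4 * x - 2


# Integer x-values at which the line and the curve meet.
common = [x for x in range(-5, 6) if curve(x) == line(x)]
print(common)  # [-2, 1]

# Just to the left and right of x = 1 the curve lies above the line on both
# sides, so the line touches the curve there without crossing it.
print(curve(0.9) - line(0.9) > 0, curve(1.1) - line(1.1) > 0)  # True True
```

This matches the factorization x^3 + x - (4x - 2) = (x - 1)^2 (x + 2): a double root at x = 1, where the line touches, and a simple root at x = -2, where it crosses.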


The number of words that appear inside quotation marks in the last two paragraphs is a
sign that we’re being too vague. So we’ll try to define precisely what a tangent line is, but as
we’ve seen, this is not easy.


However, what is easy to define is a secant line: 


So given two points P = (a, f(a)) and Q = (b, f(b)) on the graph of y = f(x), with a ≠ b, the line PQ is a secant
line. Unless the graph of y = f(x) is a line (that is, unless f(x) is a linear polynomial or a constant), we expect that
there will be infinitely many secant lines through P. However, if Q is very close to P, then the secant line PQ is
very close to what we want to be the tangent line at P. In fact, as Q approaches P, the secant line approaches
the tangent line:


[Figure: three graphs showing the secant line through P and Q as Q moves closer to P.]


“Approaches”. . . this sounds like a job for a limit! 


3.2 DEFINITION OF THE DERIVATIVE


Let’s take our informal discussion from the end of the previous section and try to make it rigorous. 


Problem 3.3: Let P = (a, f(a)) be a point on the graph of y = f(x). We wish to compute the slope of the tangent
line to the curve at the point P.


(a) Let Q = (b, f(b)) be another point on the curve (so that b ≠ a). Compute the slope of the secant line PQ.
(b) Redo part (a) where we let b = a + h for some h ≠ 0, so that Q = (a + h, f(a + h)). Write the slope in terms
of a and h.
(c) In terms of a limit, what does it mean for Q to “approach” P?
(d) Write an expression (using a limit) for the slope of the tangent line at P.


Solution for Problem 3.3: 


(a) We start with a secant line PQ, where Q = (b, f(b)) is some point other than P.
The slope of PQ is, as usual, the difference in the y-coordinates divided
by the difference in the x-coordinates:

\[
\text{slope of } PQ = \frac{f(b) - f(a)}{b - a}.
\]

(b) Instead of writing Q in terms of a “new” x-coordinate b, let’s write it
relative to P; that is, let’s set b = a + h for some nonzero real number h, so
that Q = (a + h, f(a + h)). Our expression for the slope then becomes:

\[
\text{slope of } PQ = \frac{f(a+h) - f(a)}{(a+h) - a} = \frac{f(a+h) - f(a)}{h}.
\]




(c) As Q approaches P, the number h approaches 0, and the secant line PQ
“approaches” the tangent line at P. This last clause (the secant line
“approaching” the tangent line) is not well-defined, but the part about h
approaching 0 is: it’s a limit!

(d) We can define the tangent line at P as the line through P whose slope is the
limit of the slope of the secant PQ as Q approaches P. This gives us:

\[
\text{slope of tangent line at } P = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h}.
\]

□


To summarize, we have constructed the following definition (non-provisional at last!) of a tangent line to the 
graph of a function: 


Definition: The tangent line to the curve y = f(x) at the point P = (a, f(a)) is the line passing through P with
slope

\[
\lim_{h \to 0} \frac{f(a+h) - f(a)}{h},
\]

provided that this limit is defined.


The quantity described by the above limit is so fundamental to calculus that it has a name:

Definition: The derivative of the function f at a, denoted f'(a), is

\[
f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h},
\]

provided this limit exists. If the limit exists, we say that f is differentiable at a. We say that f is differentiable
if it is differentiable at every point in its domain.


We see that the derivative f’(a), by definition, is the slope of the tangent line at (a, f(a)), provided both are 
defined. 
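
To get a feel for the limit in this definition, we can tabulate the secant slopes for shrinking values of h. The Python sketch below is only a numerical illustration; the sample function x^2, the base point a = 1, and the particular values of h are our own choices.

```python
from typing import Callable


def difference_quotient(f: Callable[[float], float], a: float, h: float) -> float:
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def square(x: float) -> float:
    return x * x


# The secant slopes settle toward a single number as h shrinks toward 0,
# whether h is positive or negative.
for h in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    print(h, difference_quotient(square, 1.0, h))
```

For this function each slope works out to 2 + h (up to rounding), so the printed values crowd around 2 from both sides. A table like this suggests the value of the limit but does not prove it.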


Note that we could have instead defined

\[
f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a},
\]

and this is in fact the exact same quantity as our definition of derivative, but it is usually easier to compute limits
when something is approaching 0, as in $\lim_{h \to 0}$. There are other minor variations in the form of the definition of the
derivative; you will explore some of these in the exercises.


Let’s go back to our earlier example: 


Problem 3.4: Let f(x) = x². Find the tangent line to the graph y = f(x) at the point (1,1).


Solution for Problem 3.4: By definition, this line has slope f'(1), so we compute:

\[
f'(1) = \lim_{h \to 0} \frac{f(1+h) - f(1)}{h} = \lim_{h \to 0} \frac{(h+1)^2 - 1}{h}.
\]

We can simplify the expression for h ≠ 0:

\[
\frac{(h+1)^2 - 1}{h} = \frac{h^2 + 2h}{h} = h + 2,
\]

and therefore

\[
f'(1) = \lim_{h \to 0} (h + 2) = 2.
\]

Thus the tangent line has slope 2 and passes through (1,1). Therefore, it is the graph of y = 2x − 1. □


Rather than just consider the derivative point-by-point, we often think of the derivative of f as a function: 


Definition: The derivative of a function f is a function f' such that

\[
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}
\]

at every x in the domain of f where the limit is defined.


Let’s go back to another of our earlier examples: 


Problem 3.5: Find the derivative of f(x) = x³ + x.


Solution for Problem 3.5: We write the limit:

\begin{align*}
f'(x) &= \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{((x+h)^3 + (x+h)) - (x^3 + x)}{h} \\
&= \lim_{h \to 0} \frac{(x^3 + 3x^2h + 3xh^2 + h^3 + x + h) - (x^3 + x)}{h} \\
&= \lim_{h \to 0} \frac{3x^2h + 3xh^2 + h^3 + h}{h} \\
&= \lim_{h \to 0} (3x^2 + 3xh + h^2 + 1) \\
&= 3x^2 + 1.
\end{align*}

□


Now, going back to Problem 3.2, we can immediately see that the slope of the tangent line to the graph of
f(x) = x³ + x at (1,2) is f'(1) = 3(1)² + 1 = 4, and thus the tangent line at (1,2) is y = 4x − 2.
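
As a sanity check on this computation, we can compare the formula 3x^2 + 1 with difference quotients at a few points. This Python sketch assumes the function f(x) = x^3 + x from Problem 3.5; the step size 1e-6 and the sample points are arbitrary choices, and the comparison is only approximate because h is small but not zero.

```python
def f(x: float) -> float:
    return x**3 + x


def f_prime(x: float) -> float:
    """The derivative found in Problem 3.5."""
    return 3 * x**2 + 1


def forward_quotient(x: float, h: float = 1e-6) -> float:
    """Difference quotient (f(x + h) - f(x)) / h for a small fixed h."""
    return (f(x + h) - f(x)) / h


for x in (-2.0, 0.0, 1.0, 2.5):
    print(x, forward_quotient(x), f_prime(x))

# Tangent line at (1, 2): slope f'(1), intercept chosen so the line passes through (1, 2).
slope = f_prime(1.0)
intercept = f(1.0) - slope * 1.0
print(slope, intercept)  # 4.0 -2.0
```

The two columns agree to several decimal places at each sample point, and the last line recovers the tangent line y = 4x - 2.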


The derivative of f(x) has many different notations:

\[
f'(x), \qquad Df, \qquad \dot{f}, \qquad \frac{df}{dx}, \qquad \frac{d}{dx} f(x).
\]

We’ll usually use the first notation. The second notation is called operator notation, but we will not use it
anywhere in this book. The third notation (the function with a dot over it) is commonly used in physics, but will
also not appear elsewhere in this book. The last two notations in the above list are called differential notation
and we’ll see more of these later in the chapter. The notations can be used both with functions that are explicitly
named and functions that are not; for example, if f(x) = sin x + x³, we could write any of

\[
f'(x), \qquad (\sin x + x^3)', \qquad \frac{df}{dx}, \qquad \frac{d}{dx} f(x), \qquad \frac{d}{dx} (\sin x + x^3)
\]

to indicate the derivative of f.


Just because we have a usable definition of the derivative, this doesn’t mean that the derivative is always
defined. There are basically three different circumstances in which the derivative f'(a) might be undefined for a
function f and a real number a in the domain of f. We’ll explore these circumstances in the next three problems.


Problem 3.6: Suppose f is a function such that f is differentiable at a (recall that this means that f'(a) is defined).
Compute $\lim_{h \to 0} f(a+h)$ in terms of f(a) and f'(a), and explain why this means that f must be continuous at a.


Solution for Problem 3.6: We start with $\lim_{h \to 0} f(a+h)$, and do some algebra to get the definition of f'(a) into the
picture:

\begin{align*}
\lim_{h \to 0} f(a+h) &= \lim_{h \to 0} \left( f(a+h) - f(a) + f(a) \right) \\
&= \lim_{h \to 0} \left( \frac{f(a+h) - f(a)}{h} \cdot h + f(a) \right) \\
&= \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} \cdot \lim_{h \to 0} h + \lim_{h \to 0} f(a).
\end{align*}

The first limit in the last line above is just the definition of f'(a), which exists since we are assuming that f is
differentiable at a. Evaluating the other limits in the above expression, we have

\[
\lim_{h \to 0} f(a+h) = f'(a) \cdot 0 + f(a) = f(a).
\]

Thus, $\lim_{h \to 0} f(a+h) = f(a)$. But this is exactly the same thing as saying that $\lim_{x \to a} f(x) = f(a)$, and hence f is continuous
at a. □


So Problem 3.6 tells us the following fundamental fact: 


Differentiability implies continuity. That is, f'(a) is defined only if f is continuous
at a. If f is not continuous at a, then f'(a) is undefined.


Important: Continuity does not imply differentiability.


Indeed, the next problem is an example of a continuous function that is not everywhere differentiable: 


Problem 3.7: Let f(x) = |x| be the absolute value function. 
(a) What is f’(x) if x > 0? 

(b) What is f’(x) if x < 0? 

(c) Attempt to compute f’(0), and verify that it is undefined.
(d) Explain, in terms of the graph of f, why f’(0) is undefined. 


Solution for Problem 3.7: 
(a) If x > 0, then f(x) = |x| = x. So, if we restrict ourselves to h sufficiently close to 0 so that x + h > 0, we can write

\[
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = \lim_{h \to 0} \frac{|x+h| - |x|}{h} = \lim_{h \to 0} \frac{(x+h) - x}{h} = \lim_{h \to 0} \frac{h}{h} = 1.
\]

So f'(x) = 1 for all x > 0.


(b) If x < 0, then f(x) = |x| = −x. As in part (a), if we restrict ourselves to h sufficiently close to 0 so that x + h < 0,
we can write

\[
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = \lim_{h \to 0} \frac{|x+h| - |x|}{h} = \lim_{h \to 0} \frac{-(x+h) - (-x)}{h} = \lim_{h \to 0} \frac{-h}{h} = -1.
\]

So f'(x) = −1 for all x < 0.
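(c) We compute:

\[
f'(0) = \lim_{h \to 0} \frac{|0+h| - |0|}{h} = \lim_{h \to 0} \frac{|h|}{h}.
\]

But we see a problem with this limit. The function |h|/h is not continuous at 0, since

\[
\frac{|h|}{h} = \begin{cases} 1 & \text{if } h > 0, \\ -1 & \text{if } h < 0. \end{cases}
\]

The graph of |h|/h is shown at right, and we can see that it is not
continuous at 0. More specifically, we see that

\[
\lim_{h \to 0^-} \frac{|h|}{h} = -1 \quad \text{and} \quad \lim_{h \to 0^+} \frac{|h|}{h} = 1.
\]

Since the left-sided limit and the right-sided limit are different, we
conclude that $\lim_{h \to 0} \frac{|h|}{h}$ does not exist. Thus, f'(0) is undefined.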


(d) We sketch the graph of f(x) = |x| below:


[Figure: graph of y = |x|.]


The x > 0 portion of the graph suggests that the slope of the tangent line at x = 0 should be 1. But the 
x < 0 portion of the graph suggests that the slope of the tangent line at x = 0 should be —1. This comes about 
because there is a sharp corner in the graph at the point (0,0), so we can’t have a uniquely defined tangent 
line at that point. 
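
The disagreement between the two sides shows up immediately if we evaluate the difference quotient of |x| at 0 for positive and negative h. This Python sketch is a small numerical illustration; the helper name `quotient_at_zero` and the sample step sizes are our own choices.

```python
def quotient_at_zero(h: float) -> float:
    """Difference quotient (|0 + h| - |0|) / h of the absolute value function at 0."""
    return (abs(0 + h) - abs(0)) / h


steps = (0.1, 0.01, 0.001)
from_right = [quotient_at_zero(h) for h in steps]   # h > 0
from_left = [quotient_at_zero(-h) for h in steps]   # h < 0

print(from_right)  # [1.0, 1.0, 1.0]
print(from_left)   # [-1.0, -1.0, -1.0]
```

However small h is taken, the quotient stays at 1 on the right and at -1 on the left, so there is no single number for it to approach.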


Sidenote: Rather bizarrely, it turns out that there are functions that are continuous everywhere
on R but are differentiable at no points of R. However, it is very difficult to
write down the description of such a function (and virtually impossible to draw
its graph).


Problem 3.8: Let f(x) = √x.
(a) Explain why f’(0) is undefined using the definition of derivative. 
(b) Explain why f’(0) is undefined in terms of the graph of f. 


Solution for Problem 3.8:

(a) We can try to compute f'(0) using the definition of derivative:

\[
f'(0) = \lim_{h \to 0} \frac{f(h) - f(0)}{h} = \lim_{h \to 0} \frac{\sqrt{h}}{h} = \lim_{h \to 0} \frac{1}{\sqrt{h}}.
\]

(Here h > 0, since √h is only defined for h ≥ 0.) But this limit does not exist: the denominator grows arbitrarily
small as h → 0, so the fraction grows arbitrarily large. (We may say that $\lim_{h \to 0} \frac{1}{\sqrt{h}} = +\infty$, as we will
explore further in Chapter 6.)


(b) Shown at right is the graph of y = √x, with a “tangent line” drawn in
at the point (0,0). We see that, geometrically, the natural choice for the
tangent line is the vertical line x = 0. However, recall that vertical lines
are not allowed as tangent lines, since by definition f'(x) is the slope of
the tangent line at x, and a vertical line has undefined slope. Thus, f'(0)
is undefined.
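
The blow-up of the difference quotient is easy to see numerically. This Python sketch assumes f is the square root function, as in Problem 3.8; the helper name and the sample values of h are our own choices.

```python
import math


def sqrt_quotient_at_zero(h: float) -> float:
    """Difference quotient (sqrt(0 + h) - sqrt(0)) / h, defined for h > 0."""
    return (math.sqrt(0 + h) - math.sqrt(0)) / h


# Each time h shrinks by a factor of 100, the quotient grows by a factor of about 10.
for h in (1e-2, 1e-4, 1e-6, 1e-8):
    print(h, sqrt_quotient_at_zero(h))
```

The quotient equals 1/sqrt(h), so shrinking h makes the secant lines steeper without limit, which is the numerical face of the vertical tangent line.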


Concept: If the graph of y = f(x) has a vertical “tangent line” at (a, f(a)), then f'(a) is
undefined.


We summarize the non-differentiability criteria explored in the previous three problems: 


Important: Given a function f and a number a ∈ Dom f, there are essentially three reasons
why the derivative f'(a) might be undefined:


• f is not continuous at a


• The graph of f has a sharp corner at (a, f(a))


• The graph of f has a vertical tangent line at (a, f(a))


There are occasionally more exotic reasons why f’(a) might be undefined, but virtually all of the examples that 
you will commonly encounter fall into one of the above three categories. 


EXERCISES 

3.2.1 Use the limit definition of derivative to compute the derivatives of the following functions: 
(a) f(x) =c where c is any real number (b).f@ = 

©) fa=2-% @ f@O=0-1 © fore 


3.2.2 Find the equation of the tangent line to the curve y = x* + 3 at the point (1, 4). 


3.2.3 Find the equation of the tangent line to the graph f(x) = E at the point (2, 2). 


3.2.4 Find a function f, with domain R, that is continuous on all of IR but that is not differentiable at x = 0 and 
Kim ds 


3.2.5% Suppose we had defined the derivative as 


Pec lim LO= = hy 


Is this equivalent to our previous definition? Why or why not? Hints: 93, 282


3.3 BASIC DERIVATIVE COMPUTATIONS


In this section, we'll start to explore some of the basic algebraic properties of the derivative. Our goal is to 
develop a catalog of derivatives of common functions (such as polynomials, trig functions, and exponentials), and 
a series of algebraic rules to compute derivatives of more complicated functions. Hopefully, by the end of this 
chapter, you'll find that calculating derivatives is a very routine process. 


We'll start with a basic algebraic property of the derivative: 


Solution for Problem 3.9: This property is true for limits, and since the derivative is defined as a limit, we shouldn't 
be too surprised that this property is true for derivatives as well. The proof is a straightforward application of the 
corresponding rules for limits: 


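\[
\begin{aligned}
(f+g)'(x) &= \lim_{h\to 0} \frac{(f+g)(x+h) - (f+g)(x)}{h} \\
&= \lim_{h\to 0} \frac{f(x+h) + g(x+h) - f(x) - g(x)}{h} \\
&= \lim_{h\to 0} \left( \frac{f(x+h) - f(x)}{h} + \frac{g(x+h) - g(x)}{h} \right) \\
&= \lim_{h\to 0} \frac{f(x+h) - f(x)}{h} + \lim_{h\to 0} \frac{g(x+h) - g(x)}{h} \\
&= f'(x) + g'(x).
\end{aligned}
\]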


It is also true that: 


Important: Let f be a differentiable function, and c ∈ ℝ. Then (cf)’ = c(f’).


The proof is very similar to the solution to Problem 3.9, and we will leave the details as an exercise. 


The previous results mean that the derivative is linear. The derivative of a sum of functions is just the sum of the derivatives of the functions, and we can “factor out” a constant factor from a derivative.


The limit definition of derivative is somewhat cumbersome to use. We'd like to build a catalog of common 
derivatives. So let’s start with some simple functions. 


Problem 3.10: Let $f(x) = x^n$, where n is an integer. Find f’(x).


Solution for Problem 3.10: First, let’s look at the case where n is a positive integer. We just plug $x^n$ into our limit definition of derivative:

\[ f'(x) = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h} = \lim_{h\to 0} \frac{(x+h)^n - x^n}{h}. \]

We can expand the $(x+h)^n$ term in the numerator using the Binomial Theorem:

\[ f'(x) = \lim_{h\to 0} \frac{\left(x^n + \binom{n}{1}x^{n-1}h + \binom{n}{2}x^{n-2}h^2 + \cdots + h^n\right) - x^n}{h}. \]

Now we want to simplify. Notice that the $x^n$ terms cancel in the numerator. Then, every remaining term has at least one factor of h, so we can divide by the h in the denominator:

\[ f'(x) = \lim_{h\to 0} \left(\binom{n}{1}x^{n-1} + \binom{n}{2}x^{n-2}h + \cdots + h^{n-1}\right). \]

Every term except for the first term has a factor of h, so when we take the limit as h approaches 0, these terms are all 0. Thus we’re just left with the first term, and since $\binom{n}{1} = n$, we have

\[ f'(x) = nx^{n-1}. \]


We'll state this result using differential notation: 


Important: If n is a positive integer, then

\[ \frac{d}{dx}x^n = nx^{n-1}. \]

Notice that when n = 1, we get $\frac{d}{dx}x = 1$. This makes perfect sense when we think about the graph of y = x:
the graph is just a line with slope 1, so the tangent line to this graph should have slope 1 everywhere. 


When n = 0, the function is just the constant function $f(x) = x^0 = 1$. We can again use the limit definition of derivative:

\[ f'(x) = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h} = \lim_{h\to 0} \frac{1 - 1}{h} = \lim_{h\to 0} 0 = 0. \]

So $\frac{d}{dx}1 = 0$. This also makes graphical sense: the graph of y = 1 is a horizontal line, which has slope 0 at every point. Note also that this function satisfies our earlier formula (for x ≠ 0):

\[ \frac{d}{dx}x^0 = 0 \cdot x^{-1} = 0. \]


Finally, we consider the function $f(x) = x^{-m}$ where m is a positive integer. We again try to use our limit definition:

\[ f'(x) = \lim_{h\to 0} \frac{\dfrac{1}{(x+h)^m} - \dfrac{1}{x^m}}{h}. \]

We can simplify the numerator by subtracting the fractions:

\[ f'(x) = \lim_{h\to 0} \frac{\dfrac{x^m - (x+h)^m}{x^m (x+h)^m}}{h} = \lim_{h\to 0} \frac{x^m - (x+h)^m}{h\,x^m (x+h)^m}. \]

As we did before, we use the Binomial Theorem to expand the numerator, cancel the $x^m$ term, and divide the rest of the terms by h. This leaves:

\[ f'(x) = \lim_{h\to 0} \frac{-\left(\binom{m}{1}x^{m-1} + \binom{m}{2}x^{m-2}h + \cdots + h^{m-1}\right)}{x^m (x+h)^m}. \]

If we now plug in h = 0, all of the terms except for the first disappear from the numerator, and the denominator becomes $x^{2m}$. So we have:

\[ f'(x) = \frac{-mx^{m-1}}{x^{2m}} = -\frac{m}{x^{m+1}} = -mx^{-m-1}. \]

Notice that this is the same formula that we got for positive integer exponents! □


Important: If n is an integer, then

\[ \frac{d}{dx}x^n = nx^{n-1}. \]

We can think of this as “moving the exponent to the front as a constant, and then decreasing the exponent by 1.”
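
As a sanity check (not a proof), the following Python sketch compares the formula against a symmetric difference quotient for a few integer exponents, including negative ones. The step size, the sample point x = 1.5, and the function names are arbitrary choices made for this illustration.

```python
def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

def power_rule(n, x):
    """Value of the derivative of x**n predicted by the rule n * x**(n - 1)."""
    return n * x ** (n - 1)

x = 1.5
for n in (-3, -1, 0, 1, 2, 5):
    approx = central_difference(lambda t: t ** n, x)
    print(f"n={n:2d}  numeric={approx:.6f}  rule={power_rule(n, x):.6f}")
```

The two columns should agree to several decimal places. The small discrepancies come from floating-point rounding and from using a nonzero h.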


We can now combine the results of the previous two problems to instantly compute the derivative of any 
polynomial function, even a “polynomial” with negative powers. 
Problem 3.11: Compute the derivatives of the following functions: 
(a) $f(x) = x^3 - 2x^2 + 6x - 9$
(b) $f(x) = 3x^{11} - 5x^9 + 188x$
(c) $f(x) = 5x^{-7} - 2x^{-1} + 3 - 8x^4$
(d) $f(x) = (2x+3)^3$


Solution for Problem 3.11: 


(a) We'll work out all the steps of this one, just so it’s clear what we're allowed to do. 


\[
\begin{aligned}
f'(x) &= (x^3 - 2x^2 + 6x - 9)' \\
&= (x^3)' + (-2x^2)' + (6x)' + (-9)' && \text{(by Problem 3.9(a))} \\
&= (x^3)' - 2(x^2)' + 6(x)' - 9(1)' && \text{(by Problem 3.9(b))} \\
&= 3x^2 - 2(2x) + 6(1) - 9(0) && \text{(by Problem 3.10)} \\
&= 3x^2 - 4x + 6.
\end{aligned}
\]

Once you’ve mastered the process, there’s no need to be so pedantic. Just immediately write:

\[ (x^3 - 2x^2 + 6x - 9)' = 3x^2 - 4x + 6. \]

(b) This time, we'll omit most of the intermediate steps:

\[ (3x^{11} - 5x^9 + 188x)' = 3(11x^{10}) - 5(9x^8) + 188(1) = 33x^{10} - 45x^8 + 188. \]

(c) Negative exponents are no problem: we just use the same rule.

\[ (5x^{-7} - 2x^{-1} + 3 - 8x^4)' = -35x^{-8} + 2x^{-2} - 32x^3. \]
A very common mistake, particularly with negative exponents, is increasing the exponent when you should be decreasing it. For example:

\[ (5x^{-7})' \neq -35x^{-6}. \]

Make sure that you decrease the exponent by 1: in this example, from −7 to −8:

\[ (5x^{-7})' = -35x^{-8}. \]

(d) Be careful! It’s tempting to do this:

Bogus Solution:
\[ \frac{d}{dx}(2x+3)^3 = 3(2x+3)^2 \]

But we don’t have a rule (yet) for taking the derivative of things like $(2x+3)^3$. We can only take derivatives of monomial terms like $cx^n$. So we need to first expand $(2x+3)^3$ into a sum of monomial terms, using the Binomial Theorem:

\[ (2x+3)^3 = (2x)^3 + 3(2x)^2(3) + 3(2x)(3)^2 + 3^3 = 8x^3 + 36x^2 + 54x + 27. \]

Now we can take the derivative:

\[ f'(x) = (8x^3 + 36x^2 + 54x + 27)' = 24x^2 + 72x + 54. \]

Note that this is not equal to $3(2x+3)^2 = 3(4x^2 + 12x + 9) = 12x^2 + 36x + 27$. But the derivative is exactly twice $3(2x+3)^2$. We'll see more of this interesting property in the next section.

□
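
The term-by-term procedure used in Problem 3.11 is mechanical enough to write as a short program. The sketch below is one possible encoding, assuming we store a “polynomial” as a Python dictionary mapping each exponent to its coefficient; the function name `differentiate` and this representation are our own choices for illustration.

```python
def differentiate(poly):
    """Differentiate a sum of monomials given as {exponent: coefficient}.

    Exponents may be negative integers, so terms like 5x^-7 are allowed.
    """
    result = {}
    for exponent, coeff in poly.items():
        if exponent == 0:
            continue  # the derivative of a constant term is 0
        # Move the exponent to the front, then decrease the exponent by 1.
        result[exponent - 1] = result.get(exponent - 1, 0) + exponent * coeff
    return result

# Problem 3.11(c): 5x^-7 - 2x^-1 + 3 - 8x^4
print(differentiate({-7: 5, -1: -2, 0: 3, 4: -8}))

# Problem 3.11(d): (2x + 3)^3 expanded as 8x^3 + 36x^2 + 54x + 27
print(differentiate({3: 8, 2: 36, 1: 54, 0: 27}))
```

Note that the program, like us, can only handle part (d) after the cube has been expanded into monomials.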


We know how to take derivatives of sums of functions and of constant multiples of functions. What about 
derivatives of products of functions? 


Problem 3.12: Let f and g be differentiable functions. Find a formula for (fg)’ in terms of f, g, f’, and g’.


Solution for Problem 3.12: Many people are tempted to say: 


Bogus Solution:
\[ (fg)' = (f')(g'). \]

Sorry, but it’s not that simple. In the vast majority of examples, the above “formula” is not true. For example, let $f(x) = x^2$ and $g(x) = x^3$. Then $(fg)' = (x^5)' = 5x^4$, but $(f')(g') = (x^2)'(x^3)' = (2x)(3x^2) = 6x^3$, so clearly $(fg)' \neq (f')(g')$.

What do we do instead? We go back to our limit definition!

\[ (fg)'(x) = \lim_{h\to 0} \frac{(fg)(x+h) - (fg)(x)}{h} = \lim_{h\to 0} \frac{f(x+h)g(x+h) - f(x)g(x)}{h}. \]
Uh-oh. There doesn’t seem to be any nice way to simplify this. 


We'll have to use some algebraic sleight of hand. We want to force into our expression some terms that look more like the terms in the definition of derivative, so let’s insert a couple of carefully chosen expressions into the numerator:

\[ (fg)'(x) = \lim_{h\to 0} \frac{f(x+h)g(x+h) - f(x)g(x)}{h} = \lim_{h\to 0} \frac{f(x+h)g(x+h) - f(x+h)g(x) + f(x+h)g(x) - f(x)g(x)}{h}. \]

This allows us to conveniently group the terms:

\[
\begin{aligned}
(fg)'(x) &= \lim_{h\to 0} \frac{f(x+h)g(x+h) - f(x+h)g(x) + f(x+h)g(x) - f(x)g(x)}{h} \\
&= \lim_{h\to 0} \frac{f(x+h)\bigl(g(x+h) - g(x)\bigr) + g(x)\bigl(f(x+h) - f(x)\bigr)}{h}.
\end{aligned}
\]


It’s even more clear what's going on if we break it up further:

\[ (fg)'(x) = \lim_{h\to 0} f(x+h)\,\frac{g(x+h) - g(x)}{h} + \lim_{h\to 0} g(x)\,\frac{f(x+h) - f(x)}{h}. \]

Now we see that the limits of the fractions above are just the derivatives of f and g! And since $\lim_{h\to 0} f(x+h) = f(x)$ (remember, f must be continuous, since it is differentiable), we conclude that

\[ (fg)'(x) = f(x)g'(x) + g(x)f'(x). \]


Important: The Product Rule for derivatives: if f and g are differentiable functions, then

\[ (fg)' = f'g + fg'. \]

We can quickly check this with our bogus example from earlier. If $f(x) = x^2$ and $g(x) = x^3$, then $(fg)(x) = x^5$, so $(fg)'(x) = 5x^4$. On the other hand, the Product Rule gives:

\[ (fg)'(x) = f'(x)g(x) + f(x)g'(x) = (2x)(x^3) + (x^2)(3x^2) = 2x^4 + 3x^4 = 5x^4. \]
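
The same check can be done exactly by machine. The sketch below assumes polynomials are stored as lists of integer coefficients, lowest degree first; the helper names are hypothetical and chosen only for this illustration. It computes (fg)’, f’g + fg’, and the bogus f’g’ for the example above.

```python
def poly_mul(p, q):
    """Multiply polynomials stored as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add two coefficient lists, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_diff(p):
    """Term-by-term derivative; a constant polynomial maps to [0]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

f = [0, 0, 1]      # x^2
g = [0, 0, 0, 1]   # x^3

lhs = poly_diff(poly_mul(f, g))                     # (fg)'
rhs = poly_add(poly_mul(poly_diff(f), g),
               poly_mul(f, poly_diff(g)))           # f'g + fg'
bogus = poly_mul(poly_diff(f), poly_diff(g))        # f'g'

print(lhs, rhs, bogus)
```

Here `lhs` and `rhs` are both the coefficient list of $5x^4$, while `bogus` is the coefficient list of $6x^3$.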
There is a similar (but, alas, somewhat uglier) formula for the derivative of a quotient: 


Important: The Quotient Rule for derivatives: if f and g are differentiable functions, then, wherever g is nonzero,

\[ \left(\frac{f}{g}\right)' = \frac{f'g - fg'}{g^2}. \]


We will leave the proof of this formula as an exercise. 


Let’s continue building our library of derivatives. Next up is trig functions: 


Problem 3.13: Compute $\frac{d}{dx}(\sin x)$.


Solution for Problem 3.13: Let f(x) = sin x. Again, we'll use the limit definition:

\[ f'(x) = \lim_{h\to 0} \frac{\sin(x+h) - \sin(x)}{h}. \]

At first, this looks intractable, but we can break up sin(x + h) using the sine angle-addition formula. This gives us:

\[ f'(x) = \lim_{h\to 0} \frac{\sin x\cos h + \sin h\cos x - \sin x}{h}. \]
This is easier to work with when we factor out the sin x and cos x terms:

\[ f'(x) = \sin x\left(\lim_{h\to 0} \frac{\cos h - 1}{h}\right) + \cos x\left(\lim_{h\to 0} \frac{\sin h}{h}\right). \]

Fortunately, we know those trig limits. Recall from Problem 2.8 that $\lim_{h\to 0} \frac{\sin h}{h} = 1$. A similar method can be used to show that $\lim_{h\to 0} \frac{\cos h - 1}{h} = 0$ (we will leave the details for you to work out if you like). So, plugging these in, we have that f’(x) = cos x. □
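
If you would like numerical evidence for the two trig limits before working out the details, the following Python sketch tabulates both quotients for shrinking h and then compares the difference quotient of sine with cos x. The sample values of h and the function names are arbitrary choices for this illustration; the table suggests the limits but does not prove them.

```python
import math

def sine_quotient(h):
    """The quotient sin(h)/h, which should approach 1 as h -> 0."""
    return math.sin(h) / h

def cosine_quotient(h):
    """The quotient (cos(h) - 1)/h, which should approach 0 as h -> 0."""
    return (math.cos(h) - 1.0) / h

def sine_difference_quotient(x, h):
    """The difference quotient of sin at x with step h."""
    return (math.sin(x + h) - math.sin(x)) / h

for h in (0.1, 0.01, 0.001):
    print(f"h={h:g}  sin(h)/h={sine_quotient(h):.8f}  "
          f"(cos(h)-1)/h={cosine_quotient(h):.8f}")

x = 1.0
print(sine_difference_quotient(x, 1e-6), math.cos(x))
```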


A similar calculation will show that $\frac{d}{dx}(\cos x) = -\sin x$. (Note the minus sign!) We will leave this calculation as an exercise. Then, we can use the rules for derivatives of products and quotients to compute the derivatives of the other trig functions. We will do one as an example:


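Problem 3.14: Compute $\frac{d}{dx}\tan x$.


Solution for Problem 3.14: This is a simple matter of writing tangent as a quotient and then applying the Quotient Rule:

\[
\begin{aligned}
\frac{d}{dx}\tan x &= \frac{d}{dx}\left(\frac{\sin x}{\cos x}\right) \\
&= \frac{(\sin x)'(\cos x) - (\sin x)(\cos x)'}{\cos^2 x} \\
&= \frac{\cos^2 x + \sin^2 x}{\cos^2 x} \\
&= \frac{1}{\cos^2 x} = \sec^2 x.
\end{aligned}
\]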


We can easily compute the derivatives of the remaining trig functions: 


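\[
\begin{array}{ll}
\dfrac{d}{dx}(\sin x) = \cos x & \dfrac{d}{dx}(\sec x) = \sec x\tan x \\[2ex]
\dfrac{d}{dx}(\cos x) = -\sin x & \dfrac{d}{dx}(\csc x) = -\csc x\cot x \\[2ex]
\dfrac{d}{dx}(\tan x) = \sec^2 x & \dfrac{d}{dx}(\cot x) = -\csc^2 x
\end{array}
\]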


We will leave the computations of the derivatives of sec, csc, and cot as exercises. 


There are a couple of other basic functions that we will want in our derivative catalog, namely exponentials 
and logarithms. Unfortunately, we don’t yet have the tools to prove these derivatives, so for now we will just 
have to tell you what they are; later in the book we will be able to prove them. 


One derivative hints at a large part of the reason why the number e is special:

\[ \frac{d}{dx}e^x = e^x. \]

That is, $e^x$ is its own derivative! In fact, $f(x) = ce^x$ (where c is any constant) are the only functions for which f’ = f, although this is hard to prove. We’ll revisit this concept later in Chapter 9 when we study differential equations.

The flip side of e being nice is that the natural logarithm function has a nice derivative too:

\[ \frac{d}{dx}(\log x) = \frac{1}{x}. \]

In fact, one way that we can define the natural logarithm is as a function whose derivative is 1/x. (We’ll discuss this more in Chapter 5.)


EXERCISES 
3.3.1 Prove that (cf)’ = c(f’) for any differentiable function f and real number c. 
3.3.2 Compute the derivatives of the following: 


(a) 4x°+3x-8-2x? (b) (3x? +2)° (c) (x7 +1)sinx (d) cos2x 
(e) e (f) (Slogx)(1 + tanx) (g) logx? (h) xe*(sec x) 
3.3.3
(a) Prove that if g is a differentiable function, then

\[ \frac{d}{dx}\left(\frac{1}{g(x)}\right) = -\frac{g'(x)}{(g(x))^2} \]

for any x ∈ Dom(g) such that g(x) ≠ 0.
(b) Prove the Quotient Rule: if f and g are differentiable functions, and g(x) ≠ 0, then:

\[ \left(\frac{f}{g}\right)'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}. \]

Hints: 276


3.3.4 Prove that $\frac{d}{dx}(\cos x) = -\sin x$.


3.3.5 Use the derivatives of sin and cos, together with the Product and Quotient Rules, to find the derivatives 
for the trigonometric functions sec, csc, and cot. 


3.4 THE CHAIN RULE FOR DERIVATIVES


In Problem 3.11(d), we saw that

\[ \frac{d}{dx}(2x+3)^3 = 6(2x+3)^2, \]

although computing this required us first to expand $(2x+3)^3$ using the Binomial Theorem and then take the derivative of each monomial. We'd like a quicker way to compute such a derivative.


Let’s look at another example: 


Problem 3.15: Our goal is to find an easy way to compute

\[ \frac{d}{dx}(x^2+1)^5 \]

without applying the Binomial Theorem.
(a) First compute it the long way, by expanding $(x^2+1)^5$ using the Binomial Theorem and then taking the derivative.
(b) Does your answer from part (a) factor nicely?
(c) Let $f(x) = x^2 + 1$ and $g(x) = x^5$. Write the function $(x^2+1)^5$ as a composition of f and g.
(d) Can you algebraically manipulate the derivative of the composition to write it in terms of f, f’, g, and g’?


Solution for Problem 3.15: Before we begin, note the following incorrect solution:

Bogus Solution:
\[ \frac{d}{dx}(x^2+1)^5 = 5(x^2+1)^4 \]

This is not correct; we cannot take our $\frac{d}{dx}x^5 = 5x^4$ rule and directly apply it to more complicated expressions raised to a power.


(a) Consider the function $h(x) = (x^2+1)^5$. We can certainly compute its derivative by expanding it into a polynomial using the Binomial Theorem:

\[ h(x) = x^{10} + 5x^8 + 10x^6 + 10x^4 + 5x^2 + 1, \]

and hence

\[ h'(x) = 10x^9 + 40x^7 + 60x^5 + 40x^3 + 10x. \]

(b) Our answer from part (a) is somewhat ugly, especially compared to the relatively nice expression that we started with. But, if you’re observant, you might notice that the above polynomial factors quite nicely:

\[ h'(x) = 10x(x^8 + 4x^6 + 6x^4 + 4x^2 + 1) = 10x(x^2+1)^4. \]

That's a lot nicer, and it seems related to our original $h(x) = (x^2+1)^5$.

(c) If we set $f(x) = x^2 + 1$ and $g(x) = x^5$, then $h(x) = (g \circ f)(x) = g(f(x)) = (x^2+1)^5$; that is, h is the composition of g and f.

(d) At the beginning, if we had been naive and treated $(x^2+1)^5$ as if it were $x^5$, then we might have made the bogus conclusion that its derivative is $5(x^2+1)^4$. As we saw, this is incorrect, by a factor of 2x, as the actual derivative is

\[ h'(x) = 10x(x^2+1)^4. \]

Where did the extra factor of 2x come from? Hmmm... if we look at what’s inside the parentheses, we see the interesting coincidence that

\[ \frac{d}{dx}(x^2+1) = 2x. \]

That’s no coincidence! We can break up h’(x) as follows:

\[ h'(x) = 10x(x^2+1)^4 = (2x)\bigl(5(x^2+1)^4\bigr). \]

We see that the first term is just f’(x). The second term is g’(f(x)): we plug $f(x) = x^2 + 1$ into $g'(x) = 5x^4$ to get $g'(f(x)) = 5(x^2+1)^4$.
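
The factorization in part (b) can be confirmed exactly by machine. The sketch below assumes polynomials are stored as integer coefficient lists, lowest degree first (the helper names are our own, for illustration only). It expands $(x^2+1)^5$ the long way, differentiates term by term, and compares the result with the expansion of $10x(x^2+1)^4$ and with the bogus $5(x^2+1)^4$.

```python
def poly_mul(p, q):
    """Multiply polynomials stored as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    """Raise a polynomial to a nonnegative integer power by repeated multiplication."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out

def poly_diff(p):
    """Term-by-term derivative; a constant polynomial maps to [0]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

inner = [1, 0, 1]                                    # f(x) = x^2 + 1

long_way = poly_diff(poly_pow(inner, 5))             # expand, then differentiate
chain_rule = poly_mul([0, 10], poly_pow(inner, 4))   # 10x (x^2 + 1)^4
bogus = [5 * c for c in poly_pow(inner, 4)]          # 5 (x^2 + 1)^4

print(long_way == chain_rule, long_way == bogus)
```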


Based on our work on Problem 3.15, our conjecture is:


Important: If f and g are differentiable functions, and if (g ∘ f) exists, then

\[ (g \circ f)'(x) = g'(f(x))\,f'(x). \]

This is known as the Chain Rule.


We can get some idea of why the Chain Rule is true by looking at the limit definition of the derivative of (g ∘ f):

\[ (g \circ f)'(x) = \lim_{h\to 0} \frac{g(f(x+h)) - g(f(x))}{h}. \]

Since we know that we want an f’(x) term to appear, we expect that we’ll need to have f(x + h) − f(x) in the numerator somewhere. So let’s multiply the above limit by $\frac{f(x+h) - f(x)}{f(x+h) - f(x)}$. Of course, this is only valid if f(x + h) − f(x) ≠ 0, so for the moment assume that f(x + h) ≠ f(x) for sufficiently small nonzero h. Then we have

\[
\begin{aligned}
(g \circ f)'(x) &= \lim_{h\to 0} \left( \frac{g(f(x+h)) - g(f(x))}{f(x+h) - f(x)} \cdot \frac{f(x+h) - f(x)}{h} \right) \\
&= \left( \lim_{h\to 0} \frac{g(f(x+h)) - g(f(x))}{f(x+h) - f(x)} \right) \left( \lim_{h\to 0} \frac{f(x+h) - f(x)}{h} \right).
\end{aligned}
\]

This second term is just the definition of f’(x), so we have

\[ (g \circ f)'(x) = \left( \lim_{h\to 0} \frac{g(f(x+h)) - g(f(x))}{f(x+h) - f(x)} \right) f'(x). \]

So all we have left to do to establish the Chain Rule is to show that

\[ \lim_{h\to 0} \frac{g(f(x+h)) - g(f(x))}{f(x+h) - f(x)} = g'(f(x)). \tag{$*$} \]

This seems plausible: as f(x + h) approaches f(x), the fraction on the left side of (∗) is the slope of the secant line from (f(x), g(f(x))) to (f(x + h), g(f(x + h))), so its limit as h approaches 0 should be the derivative of g at f(x), which is g’(f(x)).


In fact, (∗) is true provided we make the assumption that f(x + h) ≠ f(x) for sufficiently small nonzero h, but the proof is a somewhat tedious δ-ε argument. We also have not addressed the issue of what happens if f(x + h) = f(x) for values of h arbitrarily close to 0. We can fix these issues and get a rigorous proof of the Chain Rule, but the proof is rather technical, so we will defer the proof to Section 3.A.


It is often easier to remember the Chain Rule using differential notation. To do so, we write g(f(x)) = g(u), so that u = f(x). Then the Chain Rule is

\[ \frac{dg}{dx} = \frac{dg}{du} \cdot \frac{du}{dx}. \]

For example, consider the function in Problem 3.15, $(x^2+1)^5$. We'll write this as $g(u) = u^5$, where $u = x^2 + 1$. Then, applying the Chain Rule, we have

\[
\begin{aligned}
\frac{dg}{dx} &= \frac{dg}{du} \cdot \frac{du}{dx} \\
&= 5u^4 \cdot 2x \\
&= 5(x^2+1)^4 \cdot 2x = 10x(x^2+1)^4.
\end{aligned}
\]

Note especially the final step: we want our final answer to be solely in terms of x and not have any u terms. The advantage of thinking of the Chain Rule using this notation is that we can informally remember the Chain Rule as “canceling the du terms” when we multiply the “fractions” dg/du and du/dx.


In practice, we usually won’t go to all the trouble to write out dg/dx or (g ∘ f)’ or anything like that. Once you become adept with the Chain Rule, you'll just write the whole calculation out on one line, like so:

\[ \bigl((x^2+1)^5\bigr)' = 5(x^2+1)^4(x^2+1)' = 5(x^2+1)^4(2x) = 10x(x^2+1)^4. \]


Problem 3.16: Compute the following derivatives:
(a) $\frac{d}{dx}(2x^2 - 5x)^3$
(b) $\frac{d}{dx}\,\frac{1}{(1-x^4)^2}$
(c) $\frac{d}{dx}\bigl((x^3-1)^2 + 9x\bigr)^6$


Solution for Problem 3.16: All of these are straightforward applications of the Chain Rule.
(a) Let $f(x) = 2x^2 - 5x$ and $g(x) = x^3$. Then our function is g(f(x)), so by the Chain Rule, we have:

\[ \frac{d}{dx}(2x^2 - 5x)^3 = g'(f(x))\,f'(x) = 3(2x^2 - 5x)^2(4x - 5). \]

(b) Write the function as $(1 - x^4)^{-2}$. Then apply the Chain Rule:

\[ \bigl((1 - x^4)^{-2}\bigr)' = -2(1 - x^4)^{-3}(-4x^3) = \frac{8x^3}{(1 - x^4)^3}. \]

(c) We apply the Chain Rule twice:

\[
\begin{aligned}
\Bigl(\bigl((x^3-1)^2 + 9x\bigr)^6\Bigr)' &= 6\bigl((x^3-1)^2 + 9x\bigr)^5\bigl((x^3-1)^2 + 9x\bigr)' \\
&= 6\bigl((x^3-1)^2 + 9x\bigr)^5\bigl(2(x^3-1)(x^3-1)' + 9\bigr) \\
&= 6\bigl((x^3-1)^2 + 9x\bigr)^5\bigl(2(x^3-1)(3x^2) + 9\bigr) \\
&= 6\bigl((x^3-1)^2 + 9x\bigr)^5\bigl(6x^2(x^3-1) + 9\bigr).
\end{aligned}
\]
The Chain Rule works equally well with “special” functions too. As part of the next problem, we will use the “u” notation for the Chain Rule computation.


Problem 3.17: Compute the derivatives of the following functions:
(a) $\cos(4x)$
(b) $e^{x^3}$
(c) $\log(\sin x)$


Solution for Problem 3.17:
(a) Write the function as cos(u), where u = 4x. Then the derivative is

\[ (\cos(u))' = (-\sin(u)) \cdot \frac{du}{dx} = (-\sin(u)) \cdot \frac{d}{dx}(4x) = -4\sin(u) = -4\sin(4x). \]

Note the importance of the final step: make sure that the final answer is in terms of the original variable, and does not contain any temporary variable or function that we introduced to help our Chain Rule calculation. Eventually you will get proficient at this type of computation, and you won't need all the intermediate steps; you can just write

\[ \frac{d}{dx}(\cos(4x)) = -4\sin(4x). \]

(b) Write as $e^u$ where $u = x^3$. Then

\[ (e^{x^3})' = (e^u)' = e^u \cdot \frac{du}{dx} = e^u \cdot 3x^2 = 3x^2 e^{x^3}. \]

(c) We'll consolidate the steps without using u:

\[ \frac{d}{dx}\log(\sin x) = \frac{1}{\sin x} \cdot \frac{d}{dx}(\sin x) = \frac{\cos x}{\sin x} = \cot x. \]


One neat application of the Chain Rule is the formula for the derivative of an inverse function. 


Solution for Problem 3.18: Of course, we know that $f(f^{-1}(x)) = x$, by definition, so

\[ \frac{d}{dx}f(f^{-1}(x)) = \frac{d}{dx}x = 1. \]

But by the Chain Rule,

\[ 1 = \frac{d}{dx}f(f^{-1}(x)) = f'(f^{-1}(x))\,(f^{-1})'(x). \]

Thus, assuming that f’ is nonzero, we have

\[ (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))}. \]

Important: The Inverse Function Rule: if f is a function with inverse $f^{-1}$, then

\[ (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))} \]

wherever $f'(f^{-1}(x)) \neq 0$.


This result opens up a lot more derivative computations for us. A particularly important one (whose importance 
we will see in Chapter 5) is: 


Problem 3.19: Compute $\frac{d}{dx}(\sin^{-1} x)$.


Solution for Problem 3.19: By the Inverse Function Rule, and since $\frac{d}{dx}(\sin x) = \cos x$, we have

\[ \frac{d}{dx}(\sin^{-1} x) = \frac{1}{\cos(\sin^{-1} x)}. \]

But for any x ∈ (−1, 1), we have $\cos(\sin^{-1} x) = \sqrt{1 - x^2}$, where we take the positive square root since $\sin^{-1} x \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. Thus,

\[ \frac{d}{dx}(\sin^{-1} x) = \frac{1}{\sqrt{1 - x^2}}. \]

□
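
As a numerical cross-check of this result, the Python sketch below evaluates the Inverse Function Rule directly, compares it with the closed form $1/\sqrt{1-x^2}$, and compares both with a symmetric difference quotient of `math.asin`. The helper names, step size, and sample points are our own choices for illustration.

```python
import math

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

def inverse_function_rule(f_prime, f_inverse, x):
    """Evaluate 1 / f'(f^{-1}(x)), assuming the denominator is nonzero."""
    return 1.0 / f_prime(f_inverse(x))

def arcsin_derivative(x):
    """Closed form 1 / sqrt(1 - x^2), valid for -1 < x < 1."""
    return 1.0 / math.sqrt(1.0 - x * x)

for x in (-0.9, -0.5, 0.0, 0.5, 0.9):
    rule = inverse_function_rule(math.cos, math.asin, x)
    numeric = central_difference(math.asin, x)
    print(f"x={x:5.2f}  rule={rule:.6f}  closed={arcsin_derivative(x):.6f}  "
          f"numeric={numeric:.6f}")
```

All three columns should agree to several decimal places on the open interval (−1, 1); near the endpoints the derivative grows without bound, so the numerical estimate becomes unreliable there.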


We will leave it as an exercise to compute

\[ \frac{d}{dx}\tan^{-1} x = \frac{1}{1 + x^2}, \]

the importance of which we will also see in Chapter 5.


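Problem 3.20: Compute $\frac{d}{dx}\sqrt{x}$.


Solution for Problem 3.20: Think of $\sqrt{x}$ as the inverse function of $f(x) = x^2$ when x is a positive real number. Then f’(x) = 2x, and we can use the Inverse Function Rule, giving:

\[ \frac{d}{dx}\sqrt{x} = \frac{1}{f'(\sqrt{x})} = \frac{1}{2\sqrt{x}}. \]

Note that if we had written our initial function as $\sqrt{x} = x^{1/2}$, then this is consistent with our exponent derivative rule:

\[ \frac{d}{dx}x^{1/2} = \frac{1}{2}x^{-1/2} = \frac{1}{2\sqrt{x}}. \]

□


Important: For any rational number r,

\[ \frac{d}{dx}x^r = rx^{r-1}. \]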
In fact, the result is true for any real number r, rational or irrational, as we can see with the following computation:


Problem 3.21:


(a) Recall that in Chapter 1, we assumed that $(e^a)^b = e^{ab}$ for any a, b ∈ ℝ. Use this to show that $x^r = \exp(r\log x)$ for any x > 0 and r ∈ ℝ.


(b) Use this to prove that $\frac{d}{dx}x^r = rx^{r-1}$.


Solution for Problem 3.21:


(a) We recall that x = exp(log x), since exp and log are inverses. Thus

\[ x^r = (e^{\log x})^r = e^{r\log x} = \exp(r\log x). \]

(b) By the Chain Rule, and the fact that $\frac{d}{dx}e^x = e^x$, we have

\[
\begin{aligned}
\frac{d}{dx}x^r &= \frac{d}{dx}\exp(r\log x) \\
&= \exp(r\log x) \cdot (r\log x)' \\
&= \exp(r\log x) \cdot \frac{r}{x} \\
&= x^r \cdot \frac{r}{x} = rx^{r-1}.
\end{aligned}
\]

□
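
To see the formula at work for an irrational exponent, here is a small Python sketch, assuming the sample values r = √2 and x = 3 (both arbitrary choices of ours). It checks that exp(r log x) matches the built-in power and that a symmetric difference quotient matches $rx^{r-1}$.

```python
import math

def power_via_exp(x, r):
    """Compute x**r as exp(r * log(x)), valid for x > 0."""
    return math.exp(r * math.log(x))

def power_rule(x, r):
    """Value of the derivative of x**r predicted by the rule r * x**(r - 1)."""
    return r * x ** (r - 1)

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

r = math.sqrt(2)   # an irrational exponent
x = 3.0

print(power_via_exp(x, r), x ** r)
print(central_difference(lambda t: power_via_exp(t, r), x), power_rule(x, r))
```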


EXERCISES 
3.4.1 Compute the derivatives of the following functions: 


(a) (3x* +x)? (b) x2—1 (c) log(x* + cos x) (d) sin yx 


(ec) ee? (f) sin! e? (g) V1+(?+1)5 (h) & 
3.4.2 Assuming that $\frac{d}{dx}e^x = e^x$, prove that $\frac{d}{dx}(\log x) = \frac{1}{x}$.

3.4.3


(a) Let n be a nonzero integer. Show that $\frac{d}{dx}x^{1/n} = \frac{1}{n}x^{(1/n)-1}$.


(b) Let q be a nonzero rational number. Use part (a) to show that $\frac{d}{dx}x^q = qx^{q-1}$.


3.4.4★ Compute $\frac{d}{dx}(\tan^{-1} x)$. Your final answer should not contain any trig functions.
3.4.5★ The derivative of the derivative of f is called the second derivative of f, denoted f″. A nonzero polynomial f(x) with real coefficients has the property that f(x) = f’(x)f″(x). What is the leading coefficient of f(x)? (Source: HMMT) Hints: 261
3.5 ROLLE’S THEOREM AND THE MEAN VALUE THEOREM


Consider a continuous function defined on a closed interval [a,b]. By our work in Chapter 2, we know that 
this function must attain a maximum at some point c ∈ [a,b]. What do we know about the derivative at this point?
Looking at a sample sketch should give us some idea. We see that the graph tends to “flatten out” at its maximum, 
and thus the tangent line at the point (c, f(c)) is horizontal. 


This gives us a conjecture: 


Problem 3.22: Let f be a continuous function on a closed interval [a,b], and let c ∈ (a,b) be a point at which f attains its maximum on [a,b]. Show that if f is differentiable at c, then f’(c) = 0.


Before we present the solution, note that there are a couple of subtle yet important conditions in this problem. 
First, f might attain its maximum at one of the endpoints of [a,b], and there’s no reason to expect that f’ = 0 at 
these points, as the diagram on the left below shows. Second, f might not be differentiable at its maximum, as the 
diagram on the right below shows. 


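[Figure: two graphs. Left: a function attaining its maximum at an endpoint of [a,b]. Right: a function that is not differentiable at its maximum.]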


Solution for Problem 3.22: Let c ∈ (a,b) be the point where f attains its maximum. We write the definition of the derivative f’(c):

\[ f'(c) = \lim_{h\to 0} \frac{f(c+h) - f(c)}{h}. \]

Assuming that h is small enough so that c + h ∈ [a,b], what do we know about the numerator of the above fraction? Since f(c) is the maximum value that f attains, we know that f(c + h) ≤ f(c); in other words, the numerator is negative or zero. But the denominator is positive for h > 0 and negative for h < 0. Thus, by considering only positive values of h, we see that f’(c) ≤ 0, and by considering only negative h, we see that f’(c) ≥ 0. Thus, the only possibility is f’(c) = 0. □


Of course, essentially the same argument works for the minimum value of f on [a,b]. We thus have the 
following theorem: 
Important: Let f be a continuous function on [a,b].


• If f attains its maximum at c ∈ (a,b), and f is differentiable at c, then f’(c) = 0.


• If f attains its minimum at d ∈ (a,b), and f is differentiable at d, then f’(d) = 0.


Once again, remember the important “exceptions” to the above theorem: f might not be differentiable at its 
minimum or maximum, and the minimum or maximum might occur at one of the endpoints of the interval. 


If we change the conditions slightly to remove these exceptions, we get a very useful theorem:


Problem 3.23: Let f be a function that is continuous on [a,b] and differentiable on (a,b), such that f(a) = f(b). Show that there exists some c ∈ (a,b) such that f’(c) = 0.


Solution for Problem 3.23: Suppose c ∈ (a,b) is such that f attains its maximum on [a,b] at c. Then we can apply Problem 3.22 (since f is differentiable at all points on (a,b)) and conclude that f’(c) = 0.


Similarly, suppose c ∈ (a,b) is such that f attains its minimum on [a,b] at c. Again we can conclude that f’(c) = 0.


What is the only other possibility? That’s if both the minimum and the maximum occur at the endpoints. But since f(a) = f(b), this means that the function is constant on [a,b], and thus f’ = 0 everywhere!


So in all cases, f’(c) = 0 for some c ∈ (a,b). □


This result is known as: 


Important: Rolle’s Theorem: If f is continuous on [a,b] and differentiable on (a,b), and f(a) = f(b), then there exists c ∈ (a,b) such that f’(c) = 0.


Again, the easiest way to remember Rolle’s Theorem is to keep the picture 
to the right in mind—if a continuous function on [a,b] starts and ends at the 
same value f(a) = f(b), then it must “flatten out” at some point c with f’(c) = 0. 
It’s a very nice theorem, but it’s pretty specific: it only works if f(a) = f(b). 
We'd like a more general version that works on any differentiable function on 
an interval. 


Let’s consider an arbitrary function that’s continuous on [a,b] and differentiable on (a,b), as shown at right. If we tilt our head and pretend that the x-axis runs through (a, f(a)) and (b, f(b)), we get the idea that we should be able to find a point c ∈ (a,b) such that the tangent line through (c, f(c)) is parallel to our new “x-axis.”


What does this really mean? It means that the tangent line through (c, f(c)) is parallel to the line through (a, f(a)) and (b, f(b)). And, of course, “parallel” means that the two lines have the same slope. If we put it all together, we have that

\[ f'(c) = \text{slope of line through } (a, f(a)) \text{ and } (b, f(b)) = \frac{f(b) - f(a)}{b - a}. \]


We're now ready to state the theorem, which is one of the most important theorems in calculus. 


The Mean Value Theorem: If f is a continuous function on the closed interval [a,b], with a < b, and f is differentiable on (a,b), then there exists some real number c ∈ (a,b) such that

\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]


The Mean Value Theorem is arguably the most important theoretical result about the derivative. You should 
think about the Mean Value Theorem until its statement becomes intuitively obvious to you. The proof is a little bit 
tricky, so we'll defer it to Section 3.B. We'll see more of the significance of the Mean Value Theorem in Chapter 4. 
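
To build intuition, it can help to actually locate such a point c for a concrete function. The Python sketch below does this by bisection for $f(x) = x^3$ on [0, 2]; the example function and the helper name `mean_value_point` are our own choices. Note the extra assumption: bisection only works when f’ is continuous and f’ minus the secant slope has opposite signs (or vanishes) at the two endpoints, which the Mean Value Theorem itself does not require.

```python
def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Find c in [a, b] with f'(c) equal to the secant slope, by bisection.

    Assumes f' is continuous and f'(x) - slope changes sign on [a, b].
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(c):
        return f_prime(c) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("bisection needs f' - slope to change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid   # a sign change (or a zero) lies in [lo, mid]
        else:
            lo = mid   # otherwise it lies in [mid, hi]
    return (lo + hi) / 2

c = mean_value_point(lambda x: x ** 3, lambda x: 3 * x * x, 0.0, 2.0)
print(c)
```

Here the secant line through (0, 0) and (2, 8) has slope 4, and solving $3c^2 = 4$ by hand gives $c = 2/\sqrt{3}$, which the printed value should approximate.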


EXERCISES 


3.5.1 Suppose that f is continuous on a closed interval [a,b] and that f’(c) = 0 for all c ∈ (a,b). What can you conclude about f? Prove your result. Hints: 207


3.5.2 Suppose f and g are differentiable functions such that f’ = g’. Show that f − g is a constant function.
3.5.3 Let f be a degree n polynomial with n distinct real roots. Show that f’ has n − 1 distinct real roots. Hints: 262


3.5.4 Suppose that f is a differentiable function with domain ℝ such that f’(x) > 0 for all x ∈ ℝ. Show that if a, b ∈ ℝ with a < b, then f(a) < f(b). Hints: 117


3.6 IMPLICIT DIFFERENTIATION 


Many curves are not defined as the graph of a function, but instead are defined implicitly as the graph of an equation involving x and y. For example, consider the circle with center (0,0) and radius 1. This circle is the set of all points (x, y) satisfying the equation $x^2 + y^2 = 1$. Since it fails the Vertical Line Test, we know that this circle is not the graph of any function. This presents an issue with an example like the following:


Solution for Problem 3.24: Of course, we know from our knowledge of geometry and trigonometry that the slope of this line is −1. However, for the sake of this exercise, let’s compute this slope using calculus. Although the entire circle is not the graph of a function, the top half of the circle is the graph of a function, and we can compute the derivative of this function to get the slope of this line.


If we replace y with f(x) in our equation for the circle, we get

\[ x^2 + (f(x))^2 = 1, \]

which we can solve to get

\[ f(x) = \sqrt{1 - x^2}, \]

the graph of which is the top half of the circle. If we instead had taken the negative square root, we would have $f(x) = -\sqrt{1 - x^2}$, whose graph is the bottom half of the circle.


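We can now use the Chain Rule to take the derivative of $f(x) = (1 - x^2)^{1/2}$:

\[ f'(x) = \frac{1}{2}(1 - x^2)^{-1/2}(-2x) = -\frac{x}{\sqrt{1 - x^2}}. \]

We now see that

\[ f'\!\left(\frac{\sqrt{2}}{2}\right) = -\frac{\sqrt{2}/2}{\sqrt{1 - \frac{1}{2}}} = -1, \]

so that the slope of the tangent line to the circle at the point $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$ is −1, as we expect. □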


But a simpler way of finding this derivative is to just take the derivative directly of the equation that implicitly defines the circle. Specifically, we start with

\[ x^2 + (f(x))^2 = 1, \]

and we take the derivative of both sides:

\[ \frac{d}{dx}\left(x^2 + (f(x))^2\right) = \frac{d}{dx}(1). \]

Since the derivative is linear, we can break up the left side into a sum of derivatives. We also note that $\frac{d}{dx}(x^2) = 2x$ and $\frac{d}{dx}(1) = 0$, so we have

\[ 2x + \frac{d}{dx}\left((f(x))^2\right) = 0. \]

How do we take the derivative of that remaining term? We can use the Chain Rule! This gives us

\[ 2x + 2(f(x))(f'(x)) = 0. \]

Now we solve for the derivative:

\[ f'(x) = -\frac{2x}{2f(x)} = -\frac{x}{f(x)}. \]

Now we see, for example, that at the point $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$ the derivative is $-\frac{\sqrt{2}/2}{\sqrt{2}/2} = -1$, just as before.


More typically, equations that implicitly define functions are written in terms of x and y. Thus, we would write the circle as

\[ x^2 + y^2 = 1, \]

and implicitly differentiate to get

\[ 2x + 2yy' = 0, \]

from which we could solve to get $y' = -\frac{x}{y}$.


Concept: The main thing to keep in mind when performing implicit differentiation is that taking the derivative (with respect to x) of any term containing a “y” is going to result in a y’ term being introduced. In other words, $\frac{d}{dx}y = y'$.


Here is a somewhat more complicated example: 


Problem 3.25: Find the equation for the tangent line to the curve $3y^2 - 6xy + 2x^3 - 5y = 2$ at the point (2, 1).


Solution for Problem 3.25: First, it’s a good idea to verify that (2, 1) is on the curve, which we can do by plugging (x, y) = (2, 1) into the equation:

\[ 3(1)^2 - 6(2)(1) + 2(2)^3 - 5(1) = 3 - 12 + 16 - 5 = 2. \]

Next, we compute the derivative of the equation that implicitly defines the curve. Again, the idea is simple: whenever we have a term with a y in it, we have to apply the Chain Rule to get a y’ term. Thus we get

\[ 6yy' - 6xy' - 6y + 6x^2 - 5y' = 0. \]

Don’t forget to differentiate the constant on the right side as well (to get 0). We could solve this for y’ in terms of x and y, but in this problem it’s simpler to plug in (x, y) = (2, 1) right away to get the slope of the tangent at (2, 1):

\[ 6y' - 12y' - 6 + 24 - 5y' = 0, \]

so $y' = \frac{18}{11}$. Thus, in point-slope form, the equation of the tangent line is

\[ y - 1 = \frac{18}{11}(x - 2). \]

□
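
The arithmetic in this solution can be double-checked with exact fractions. The sketch below is specific to this one curve: the formula inside `slope` comes from solving the differentiated equation $6yy' - 6xy' - 6y + 6x^2 - 5y' = 0$ for y’ by hand, and the function names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def on_curve(x, y):
    """Check whether (x, y) satisfies 3y^2 - 6xy + 2x^3 - 5y = 2."""
    return 3 * y**2 - 6 * x * y + 2 * x**3 - 5 * y == 2

def slope(x, y):
    """y' obtained by solving 6yy' - 6xy' - 6y + 6x^2 - 5y' = 0 for y'.

    Collecting the y' terms gives y'(6y - 6x - 5) = 6y - 6x^2.
    Assumes the denominator 6y - 6x - 5 is nonzero at the point.
    """
    x, y = Fraction(x), Fraction(y)
    return (6 * y - 6 * x**2) / (6 * y - 6 * x - 5)

m = slope(2, 1)

def tangent_line(x):
    """Point-slope form of the tangent line through (2, 1)."""
    return 1 + m * (x - 2)

print(on_curve(2, 1), m, tangent_line(13))
```

Using `Fraction` avoids any rounding, so the slope is reported as a ratio of integers rather than a decimal approximation.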


EXERCISES 
3.6.1 Find the slope of the tangent line to the hyperbola $x^2 - y^2 = 1$ at the point $(2, \sqrt{3})$.
3.6.2 Find the equation of the tangent line to the curve y’ — 3xy” + x? — xy = 6 at the point (—1, 1). 


d 
3.6.3 Find = if x2 + y = log(y? - 1). 


3.6.4 Find the slope of the tangent line to the curve x sin(x + y) = ycos(x — y) at the point (0, ). 


3.6.5 Assume that $\frac{d}{d\theta}\sin\theta = \cos\theta$. Use implicit differentiation to prove that $\frac{d}{d\theta}\cos\theta = -\sin\theta$. Hints: 258


3.7 SUMMARY OF DERIVATIVE COMPUTATION 


Here is a summary of derivative rules and the catalog of common derivatives that we developed in this 
chapter. You should learn all of the items below, and practice until computing a derivative is as easy as addition 
or multiplication. 


Definition
\[ f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} \]
Linearity
(f + g)'(x) = f'(x) + g'(x)
(cf)'(x) = cf'(x)
Product Rule
(fg)'(x) = f'(x)g(x) + f(x)g'(x)
Quotient Rule
\[ \left(\frac{f}{g}\right)'(x) = \frac{f'(x)g(x) - f(x)g'(x)}{(g(x))^2} \]
Inverse Function Rule
\[ (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))} \]
Monomials
\[ \frac{d}{dx} x^n = nx^{n-1} \]
Trig Functions
\[ \frac{d}{dx}(\sin x) = \cos x \qquad \frac{d}{dx}(\cos x) = -\sin x \]
\[ \frac{d}{dx}(\tan x) = \sec^2 x \qquad \frac{d}{dx}(\cot x) = -\csc^2 x \]
\[ \frac{d}{dx}(\sec x) = \sec x \tan x \qquad \frac{d}{dx}(\csc x) = -\csc x \cot x \]
Exponential and Natural Logarithm
\[ \frac{d}{dx}(e^x) = e^x \qquad \frac{d}{dx}(\log x) = \frac{1}{x} \]
Chain Rule
(g ∘ f)'(x) = g'(f(x)) f'(x)
Mean Value Theorem: If f is continuous on [a,b] and differentiable on (a,b), then there exists c ∈ (a,b) such
that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
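One way to practice with this catalog is to spot-check entries numerically. The sketch below compares a few of the listed derivatives against a central-difference estimate at a single sample point. The step size, the sample point x = 0.7, and the names `numeric_derivative`, `rules`, and `max_error` are our own assumptions for illustration. Agreement at one point is evidence, not a proof.

```python
import math

def numeric_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Each entry pairs a function with the derivative claimed by the summary.
rules = {
    "x^5": (lambda x: x**5, lambda x: 5 * x**4),
    "tan x": (math.tan, lambda x: 1 / math.cos(x) ** 2),
    "sec x": (lambda x: 1 / math.cos(x), lambda x: math.tan(x) / math.cos(x)),
    "log x": (math.log, lambda x: 1 / x),
    "sin(x^2), via the Chain Rule": (lambda x: math.sin(x**2),
                                     lambda x: 2 * x * math.cos(x**2)),
}

def max_error(x=0.7):
    """Largest gap between the numeric estimate and the claimed derivative at x."""
    return max(abs(numeric_derivative(f, x) - df(x)) for f, df in rules.values())

print(max_error() < 1e-6)
```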


REVIEW PROBLEMS 


3.26 Compute the derivatives of the following: 


1 
(a) Ox+3p 6) Ver+e 
(c) log(sinx cos x) (d) ‘ 1+ s/x+ Ver 
(e) axe~’* where a and bare real constants (f) e~ 
(g) sin™! x? (h) sin’ x 


3.27 Find the equation of the tangent line to the graph y = (x - 1)^3 + 2 at the point (3, 10).
3.28 Let f(x) = x^2 - 6x + 10 with domain [3, +∞). Find (f^{-1})'(2).
3.29 Find the tangent line to the curve y^2 = x^3 - 3x + 1 at the point (2, √3).


3.30 Define
\[ f^*(a) = \lim_{h \to 0} \frac{f(a+h) - f(a-h)}{2h}. \]
Show that if f’(a) exists, then so does f*(a), and f*(a) = f’(a). 
3.31 


(a) Show that if f, g,h are differentiable functions, then 


(fgh)’ = f’gh+ fg’h+ fgh'. 
Hints: 191 
(b) Suppose f_1, f_2, ..., f_n are differentiable functions. Find a formula for (f_1 f_2 ··· f_n)'. Hints: 226


3.32 Find a continuous function f for which the Mean Value Theorem fails. That is, find a continuous function 
f and a < b with [a,b] ⊆ Dom(f), such that there does not exist c ∈ (a,b) satisfying


\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]


Hints: 249 
3.33 Let f(x) =1+x+2x7 +--+ +x! Find f’(1). (Source: HMMT) Hints: 208 


CHALLENGE PROBLEMS 


3.34 Let a be a positive real number. Find d/dx (a^x) and d/dx (log_a(x)). Hints: 274


3.35 Let f(x) = x^3 + ax + b, with a ≠ b, and suppose that the tangent lines to the graph of f at x = a and x = b are
parallel. Find f(1). (Source: HMMT) Hints: 166, 42 


3.36 Recall that a function f is even if f(—x) = f(x) and is odd if f(—x) = —f(x). Prove that if f is odd, then f’ is 
even, and that if f is even, then f’ is odd. 


3.37 Suppose the function f(x) — f(2x) has derivative 5 at x = 1 and derivative 7 at x = 2. Find the derivative of 
f(x) — f(4x) at x = 1. (Source: HMMT) Hints: 22, 213 


3.38 We write f^{(k)}(x) to mean taking the derivative of f(x) k times. Prove that if f(x) is a degree n polynomial,
then f^{(n)}(x) = c where c is a nonzero constant. Hints: 140


3.39 A polynomial function f(x) has a root r with multiplicity m if
f(x) = (x - r)^m h(x)

for some polynomial h(x) with h(r) ≠ 0.
(a) Prove that if f has root r with multiplicity m, then f' has root r with multiplicity m - 1.
(b) Prove that f^{(k)}(r) = 0 for all 1 ≤ k < m.
3.40 Suppose f and g are differentiable functions. 
(a) Show that 

(fg)'' = f''g + 2f'g' + fg''.
(b)* Generalize part (a) to show that for any positive integer n,


\[ (fg)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} f^{(k)} g^{(n-k)}. \]


This is called the Leibniz Rule. Hints: 33, 127


3.A PROOF OF THE CHAIN RULE 


Problem 3.41: Prove that if f and g are differentiable functions, and if a ∈ R is such that (g ∘ f)(a) exists, then


(g ∘ f)'(a) = g'(f(a)) f'(a).
Sidenote: The argument presented below is largely based on the argument in [Sp, Chapter 10]. There are
other ways to prove this, but I find the argument in [Sp] to be the nicest I've seen.


Solution for Problem 3.41: We start as we did on page 73, with the limit definition of (g ∘ f)'(a):


\[ (g \circ f)'(a) = \lim_{h \to 0} \frac{g(f(a+h)) - g(f(a))}{h}. \]


We want an f’(a) term to appear on the right side of the above, thus it appears that we'll need to have f(a+h) — f(a) 
in the numerator. So let’s multiply by it. Of course, if we multiply by it, we have to divide by it too: 


\[ \frac{g(f(a+h)) - g(f(a))}{h} = \frac{g(f(a+h)) - g(f(a))}{f(a+h) - f(a)} \cdot \frac{f(a+h) - f(a)}{h}. \tag{3.1} \]
This is nice: the second fraction in (3.1) has limit f'(a) as h → 0. But we have a problem if some h ≠ 0 satisfies
f(a+h) - f(a) = 0. If this occurs, then the expression in (3.1) is not defined at that h, because the denominator of
the first term is 0.


To be a little more precise about what the problem is here and how to fix it, suppose we define
\[ r(h) = \frac{g(f(a+h)) - g(f(a))}{f(a+h) - f(a)}. \]
This is the first fraction on the right side of (3.1). We'd like to be able to show that lim_{h→0} r(h) = g'(f(a)). The problem
is that r(h) is undefined wherever f(a+h) = f(a). This is not an issue at h = 0, because we're taking a limit, but
it is an issue if this happens at any h ≠ 0. So what we'll do is use some “spackle” and fill in the “holes” in the
definition of r that occur where f(a+h) = f(a). Specifically, we extend r(h) to a new function p(h) by:


\[ p(h) = \begin{cases} \dfrac{g(f(a+h)) - g(f(a))}{f(a+h) - f(a)} & \text{if } f(a+h) \neq f(a), \\ g'(f(a)) & \text{if } f(a+h) = f(a). \end{cases} \]


Now p doesn't have any holes, since we've filled the holes with the value g'(f(a)).


We now show that lim_{h→0} p(h) = g'(f(a)). This means that for any ε > 0, we need to be able to find δ > 0 such that


\[ 0 < |h| < \delta \implies |p(h) - g'(f(a))| < \epsilon. \tag{3.2} \]
However, we know by definition that


\[ \lim_{k \to 0} \frac{g(f(a) + k) - g(f(a))}{k} = g'(f(a)), \tag{3.3} \]


which means that we can choose C > 0 such that


\[ 0 < |k| < C \implies \left| \frac{g(f(a) + k) - g(f(a))}{k} - g'(f(a)) \right| < \epsilon, \tag{3.4} \]


and we know that f is continuous at a, so we can choose δ > 0 such that


\[ 0 < |x| < \delta \implies |f(a+x) - f(a)| < C. \tag{3.5} \]


We claim that this is the δ that satisfies (3.2). To see why, let h be given such that 0 < |h| < δ. If f(a+h) = f(a), then
by definition p(h) = g'(f(a)) (remember, this is the “spackle”), so clearly


\[ |p(h) - g'(f(a))| = |g'(f(a)) - g'(f(a))| = 0 < \epsilon, \]
as needed. On the other hand, if f(a+h) ≠ f(a), then we use (3.5) to get 0 < |f(a+h) - f(a)| < C. Then, plugging
f(a+h) - f(a) in for k in (3.4), we get


\[ \left| \frac{g(f(a) + (f(a+h) - f(a))) - g(f(a))}{f(a+h) - f(a)} - g'(f(a)) \right| < \epsilon. \tag{3.6} \]


But that fraction in (3.6) is just p(h):


\[ \frac{g(f(a) + (f(a+h) - f(a))) - g(f(a))}{f(a+h) - f(a)} = \frac{g(f(a+h)) - g(f(a))}{f(a+h) - f(a)} = p(h). \]


Thus, (3.6) becomes
\[ |p(h) - g'(f(a))| < \epsilon, \]


which is what we needed to establish in (3.2). Therefore, we have shown that lim_{h→0} p(h) = g'(f(a)).
Now we can prove the Chain Rule. We modify (3.1) by observing that, for all h ≠ 0, we have
\[ \frac{g(f(a+h)) - g(f(a))}{h} = p(h) \cdot \frac{f(a+h) - f(a)}{h}, \]


because if f(a+h) ≠ f(a), this is the same expression that we had in (3.1), and if f(a+h) = f(a), then both sides of
the above are 0. Thus, we can now finally complete the computation of the derivative of g ∘ f:


\[ \begin{aligned} (g \circ f)'(a) &= \lim_{h \to 0} \left( p(h) \cdot \frac{f(a+h) - f(a)}{h} \right) \\ &= \left( \lim_{h \to 0} p(h) \right) \cdot \left( \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} \right) \\ &= g'(f(a)) f'(a). \end{aligned} \]
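To see the role of the “spackle” concretely, here is a small Python sketch of the factored difference quotient p(h) · (f(a+h) - f(a))/h. The function name `chain_quotient` and the sample functions are hypothetical choices of ours. The constant inner function is the extreme case in which f(a+h) = f(a) for every h, so the naive quotient r(h) would divide by zero everywhere.

```python
def chain_quotient(g, dg, f, a, h):
    """Difference quotient of g(f(x)) at a, written as p(h) * (f(a+h) - f(a)) / h."""
    k = f(a + h) - f(a)
    if k != 0:
        # The ordinary quotient r(h), defined because f(a+h) != f(a).
        p = (g(f(a + h)) - g(f(a))) / k
    else:
        # The "spackle": fill the hole with g'(f(a)).
        p = dg(f(a))
    return p * k / h

square = lambda y: y * y
d_square = lambda y: 2 * y

# Constant inner function: k is always 0, and the quotient is exactly 0.
print(chain_quotient(square, d_square, lambda x: 5.0, 1.0, 1e-6))  # 0.0

# Inner function 3x: g(f(x)) = 9x^2, so the value should be close to 18 at a = 1.
print(abs(chain_quotient(square, d_square, lambda x: 3 * x, 1.0, 1e-6) - 18) < 1e-3)
```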


3.B PROOF OF THE MEAN VALUE THEOREM 


Problem 3.42: The Mean Value Theorem: Let f be a continuous function
on [a,b] that is differentiable on (a,b). Show that there exists a point c ∈ (a,b)
such that


\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]


Solution for Problem 3.42: Recall that the expression (f(b) - f(a))/(b - a) is the slope
of the secant line between (a, f(a)) and (b, f(b)), shown by the dashed line
in the picture to the right. The Mean Value Theorem asserts that there is
a point (c, f(c)) on the graph of y = f(x) such that the tangent line at (c, f(c)) is parallel to the secant line from
(a, f(a)) to (b, f(b)), as in the picture at right. Also note that if f(a) = f(b), then the Mean Value Theorem asserts
that f'(c) = 0; in other words, we just have Rolle's Theorem.
We'd like to define a new function g with g(a) = g(b) in order to apply Rolle's Theorem. It's not immediately
clear how we might do this. We might as well take g(a) = f(a), and then we will aim for g(b) = g(a) = f(a). A
simple approach is to try to have some sort of linear “correction” of the form


g(x) = f(x) - (x - a)(something),
noting that g(a) = f(a) - (a - a)(something) = f(a). Plugging in x = b gives
g(b) = f(b) - (b - a)(something).


We want g(b) = g(a) = f(a), so we need the “something” to be (f(b) - f(a))/(b - a), which happens to be the slope of the
secant line from (a, f(a)) to (b, f(b)). Therefore, our new function is


\[ g(x) = f(x) - \frac{f(b) - f(a)}{b - a}(x - a). \]

Notice that g is continuous on [a,b] (since it is a linear combination of the continuous functions f(x) and (x - a)),
and that

\[ g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}, \]
so g is differentiable on (a,b). Also we verify that g(a) = f(a) and
\[ g(b) = f(b) - \frac{f(b) - f(a)}{b - a}(b - a) = f(a), \]


so g(a) = g(b). Thus, we may apply Rolle's Theorem to g and find a point c ∈ (a,b) such that g'(c) = 0. Finally,


\[ f'(c) = g'(c) + \frac{f(b) - f(a)}{b - a} = \frac{f(b) - f(a)}{b - a}, \]


as desired. □
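The theorem only asserts that such a c exists, but for a specific function we can often locate it. The sketch below does this for the hypothetical example f(x) = x^3 on [0, 2], where the secant slope is 4 and solving 3c^2 = 4 gives c = 2/√3. It bisects on g'(x) = f'(x) - slope, which assumes that f' is continuous and that g' changes sign on the interval. Neither assumption is part of the theorem itself.

```python
import math

def mvt_point(f, df, a, b, tol=1e-12):
    """Find c in (a, b) with f'(c) equal to the secant slope, by bisection on g'.

    Assumes f' is continuous and g'(x) = f'(x) - slope changes sign on [a, b].
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    g_lo = df(lo) - slope
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = df(mid) - slope
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # same sign as the left end: move left end up
        else:
            hi = mid                # sign change lies in [lo, mid]
    return (lo + hi) / 2

c = mvt_point(lambda x: x**3, lambda x: 3 * x**2, 0.0, 2.0)
print(abs(c - 2 / math.sqrt(3)) < 1e-9)
```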
APPLICATIONS OF THE DERIVATIVE


In Chapter 3 we introduced the concept of derivative in terms of geometry: the derivative of f(x) at a point 
(a, f(a)) on the graph of f is the slope of the tangent line to the curve at that point. However, if this is all that the 
derivative were good for, then calculus would be an interesting geometrical curiosity, and not the fundamental 
part of mathematics that it is today. Indeed, the derivative has many more applications than just finding slopes of 
tangent lines. We'll explore these applications in this chapter. 


4.1 GRAPHICAL INTERPRETATION OF THE DERIVATIVE 


We defined the derivative f'(a) of a function f at a point a ∈ Dom(f) to be the slope of the tangent line to
the graph y = f(x) at (a, f(a)). This right away gives us some pretty important qualitative information about the 
function’s behavior at x = a. Let’s start with some definitions: 


Definition: Let f be a function. 
• f is increasing on an interval I if for all a,b ∈ I with a < b, we have f(a) ≤ f(b).


• f is decreasing on an interval I if for all a,b ∈ I with a < b, we have f(a) ≥ f(b).


In either case we say that f is monotonic on I. If I is the entire domain of f, then we can simply say that f is 
increasing (or decreasing). 


Intuitively, the graph of an increasing function is “always rising” and the graph of a decreasing function is 
“always falling.” In other words, the graph of an increasing function “moves upwards” as we move from left 
to right, and the graph of a decreasing function “moves downwards.” However, because the inequalities in our 
definition above are not strict (that is, we used “≤” rather than “<” in the definition of increasing), the function
might also “level off” at some point; indeed, under our definition, the constant function f(x) = c is both increasing 
and decreasing. 
We can modify our definition slightly:


Definition: Let f be a function.
• f is strictly increasing on an interval I if for all a,b ∈ I with a < b, we have f(a) < f(b).


• f is strictly decreasing on an interval I if for all a,b ∈ I with a < b, we have f(a) > f(b).


In either case we say that f is strictly monotonic on I. If I is the entire domain of f, then we can simply say
that f is strictly increasing (or strictly decreasing).


The only difference between “increasing” and “strictly increasing” is that in the latter, we do not permit the 
function to “level off.” 


If the function is differentiable, then the derivative gives us a nice test for monotonicity. 


Problem 4.1: Let f be a differentiable function. 
(a) Show that if f'(x) > 0 for all x ∈ Dom(f), then f is strictly increasing.


(b) Replace the “>” in the above expression with “<”, “≥”, or “≤” and determine what conclusion can be
made.


Solution for Problem 4.1: 


(a) Thinking of a picture gives us the idea: we can see that since the slope of any tangent line is always positive, 
the function is always “rising” as we move from left to right. Although this is a good intuitive way to think 
about what's going on, it is not a rigorous proof by any means. So how can we prove that the statement is 
true? 

Let's use the definition of strictly increasing to set up a rigorous proof. We are given two values a,b ∈
Dom(f), with a < b, and we need to show that f(a) < f(b). Hmmm... do we know any expressions that
relate the quantities a, b, f(a), f(b), and f'?


Concept: When trying to prove something about various quantities, think if you know any 


formulas, theorems, or previously-solved problems that involve these quantities. 


The Mean Value Theorem does the trick! Recall that it states that there must be some c ∈ (a,b) such that


\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
But we are given that f'(c) > 0, and we know that b - a > 0. So we must also have that f(b) - f(a) > 0, meaning
that f(a) < f(b), and thus we conclude that the function is strictly increasing.


(b) Replacing the inequality changes what we know about f’(c) in our Mean Value Theorem expression, but the 
key fact is that since b — a > 0, then the sign of f’(c) (whether it is positive, negative, or 0) will always match 
the sign of f(b) — f(a). So we can make the following table: 


f'(c) > 0 | f(b) - f(a) > 0 | f is strictly increasing
f'(c) ≥ 0 | f(b) - f(a) ≥ 0 | f is increasing
f'(c) < 0 | f(b) - f(a) < 0 | f is strictly decreasing
f'(c) ≤ 0 | f(b) - f(a) ≤ 0 | f is decreasing


□


More generally, looking at specific intervals (and not necessarily the entire domain of f), we have the following
result:


Important: Let f be a continuous differentiable function on an interval [a,b].

• If f'(x) > 0 for all x ∈ (a,b), then f is strictly increasing on [a,b].
• If f'(x) ≥ 0 for all x ∈ (a,b), then f is increasing on [a,b].
• If f'(x) < 0 for all x ∈ (a,b), then f is strictly decreasing on [a,b].
• If f'(x) ≤ 0 for all x ∈ (a,b), then f is decreasing on [a,b].
This is our first glimpse of how we can interpret the derivative as the rate of change of f. Basically, f’ is 


measuring the rate at which f(x) changes as we increase x. If f’ > 0, then the rate of change is positive, and the 
function is increasing. If f’ < 0, then the rate of change is negative, and the function is decreasing. 


Thinking of the derivative as a rate of change is probably the most important 
application of the derivative. We will see this interpretation throughout this 


chapter, and indeed throughout calculus. 


Problem 4.2: Let f be a differentiable function with f'(x) = 0 for all x ∈ (a,b). What can we conclude about f?


Solution for Problem 4.2: Because f' ≥ 0, the function is increasing on [a,b]. This means that f(c) ≤ f(d) for all
a ≤ c < d ≤ b. On the other hand, because f' ≤ 0, the function is decreasing on [a,b]. This means that f(c) ≥ f(d) for
all a ≤ c < d ≤ b. The only way that we can have f(c) ≤ f(d) and simultaneously have f(c) ≥ f(d) is if f(c) = f(d).
But this is true for all c,d ∈ [a,b]. Hence, the function f is constant on [a,b].


Also, this makes sense as a rate of change: if f' = 0, then the rate of change of f is 0. That is, the function isn't
changing at all! The only functions that don't change at all are constant functions. □


One immediate application of the sign of the derivative is that it lets us quickly make a rough sketch of the
graph of an unfamiliar function. For example:


Problem 4.3: Sketch the graph of f(x) = x^3 - 3x^2 + 1.


Solution for Problem 4.3: We can start by finding a few “easy” points. One point is 
particularly easy: we plug in x = 0 to get f(0) = 1, so (0,1) must be on the graph. 
Also easy are f(1) = —1 and f(—1) = —3, so (1,—1) and (—1,-3) are also on the 
graph. We plot these points on the graph at right. 


Our next step is to find where the function is increasing and where it is 
decreasing. For this, we compute the derivative: 


f'(x) = 3x^2 - 6x.


The zeros of 3x^2 - 6x = 3x(x - 2) are at x = 0 and x = 2. Further, we can see that
f'(x) > 0 on (-∞, 0) and on (2, +∞), and f'(x) < 0 on (0,2). We thus conclude that
f(x) is increasing from -∞ up to 0, then decreasing from 0 to 2, then increasing
from 2 to +∞. Sometimes this information is represented as a sign graph for the derivative, as follows:
+ + + + + 0 - - - - 0 + + + +   f'
          0         2
This little chart just indicates the sign of the derivative. Where it’s “+”, the function f is strictly increasing, 


and where it’s “—”, the function f is strictly decreasing. 


Next, since the function changes from decreasing to increasing at x = 2, we probably want to calculate
f(2) = 8 - 12 + 1 = -3. So the graph passes through (2, -3).


Therefore, the graph increases up to (0,1) (passing through (-1, -3) on the way), then decreases down to (2, -3)
(passing through (0,1) and (1,-1) on the way), and then increases towards +∞.


Putting all this information together, we get our sketch:


[Figure: sketch of the graph of y = f(x)]

Interval | f
(-∞, 0) | increasing
0 | f(0) = 1
(0, 2) | decreasing
2 | f(2) = -3
(2, +∞) | increasing


Finally, we can use this sketch to get information about the roots of the cubic. By the Intermediate Value 
Theorem, we know that one root is between —1 and 0, one root is between 0 and 2, and one root is greater than 2. 
□
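The root locations claimed above can be confirmed numerically. The following Python sketch samples the sign of f' in each interval and then brackets each root of f(x) = x^3 - 3x^2 + 1 by bisection, which is the Intermediate Value Theorem in algorithmic form. The helper name `bisect_root` and the bracket (2, 3) for the largest root are our own choices; the latter works because f(2) = -3 and f(3) = 1.

```python
def f(x):
    return x**3 - 3 * x**2 + 1

def df(x):
    return 3 * x**2 - 6 * x

def bisect_root(func, lo, hi, tol=1e-10):
    """Find a root of func in [lo, hi], assuming func changes sign on the interval."""
    assert func(lo) * func(hi) < 0, "need a sign change to apply the IVT"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(lo) * func(mid) <= 0:
            hi = mid    # sign change (or exact root) in [lo, mid]
        else:
            lo = mid    # sign change in [mid, hi]
    return (lo + hi) / 2

# One sample point from each of (-inf, 0), (0, 2), (2, +inf).
signs = [df(x) > 0 for x in (-1, 1, 3)]
roots = [bisect_root(f, lo, hi) for lo, hi in ((-1, 0), (0, 2), (2, 3))]
print(signs)                           # [True, False, True]
print([round(r, 4) for r in roots])
```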


We might ask a similar question about the derivative itself: when is it increasing or decreasing? To answer 
this, we start with the function f’, and we look at its derivative (f’)’ = f”. This is the second derivative of the 
original function f. (Analogously, we sometimes call f’ the first derivative of f.) 


Let’s continue with the example from Problem 4.3. 


Problem 4.4: Let f(x) = x^3 - 3x^2 + 1.
(a) Where is f' increasing? Where is it decreasing?


(b) What does this information say about the shape of the graph of f?


Solution for Problem 4.4: 


(a) We had f'(x) = 3x^2 - 6x, so f''(x) = 6x - 6. This is positive for x > 1, zero at x = 1, and negative for x < 1.
So when x < 1, the derivative is decreasing, and when x > 1, the derivative is increasing. We can add this
information to our sign chart:
+ + + + + 0 - - - - - - - - - 0 + + + + +   f'
- - - - - - - - - - 0 + + + + + + + + + +   f''
          0         1         2

We notice that f’ starts positive, but is decreasing. At x = 0 we have f’ switching from positive to negative, 
and still decreasing. At x = 1 we see that f’ switches from decreasing to increasing; note f’ is still negative. 
At x = 2, while f’ is increasing, it switches from negative to positive, and continues to increase for x > 2. 


(b) We can see the effect of f'' on the graph of f:


[Figure: graph of f with tangent line segments drawn at several points]

Interval | f | f'
(-∞, 0) | increasing | decreasing
(0, 1) | decreasing | decreasing
(1, 2) | decreasing | increasing
(2, +∞) | increasing | increasing


We have drawn some segments of the tangent lines at various points on the curve. We can see that as we
move along the curve from left to right towards x = 1, the slopes of the tangent lines are decreasing, because
f'' < 0. The fact that the derivative is decreasing means that the tangent lines to the curve are decreasing in
slope, which means that they are rotating clockwise as we move along the curve.


Another way that you can think of this is that if you are a person walking along the curve in the increasing 
x direction, then the curve is bending to the right as you walk along it. In our example, if we walk along the 
curve from (—1, —3) through (0, 1) and towards (1, —1), the “path” provided by the curve is gradually bending 
to the right. (This is most noticeable in the big, sweeping right curve that we have to make near x = 0.) 
The curve bends to the right because the tangents are decreasing in slope, so their directions are rotating 
clockwise, which is towards the right of your direction of travel. 


We say that the portion of the function where f” < 0 is concave down. Basically, where f is concave down, 
the graph of f is “bending downwards” as it moves from left to right. This bending can happen whether the 
curve is increasing or decreasing, as we see in our example above: the function is increasing up to x = 0, then 
decreasing between x = 0 and x = 1, but the whole time it is concave down. 


Conversely, if f” > 0, then the derivative is increasing. This means that the tangent lines are increasing 
in slope, which means that they are rotating counterclockwise. In our example, this occurs when x > 1. We 
add more tangent line segments to our picture to illustrate this behavior: 
[Figure: the graph of f with additional tangent line segments drawn for x > 1]

Interval | f | f'
(-∞, 0) | increasing | decreasing
(0, 1) | decreasing | decreasing
(1, 2) | decreasing | increasing
(2, +∞) | increasing | increasing


We say that the function is concave up where f” > 0: the graph is “bending upwards” as x increases. 
Again, this bending can happen whether the function is increasing or decreasing. 


Finally, the concavity of the graph may change (from up to down or vice versa) at points where f''(x) = 0.
Any point at which the concavity of the graph changes is called an inflection point. In our f(x) = x^3 - 3x^2 + 1
example, we have f''(1) = 0, and the concavity of the graph changes from concave down (for x < 1) to
concave up (for x > 1). Thus (1,-1) is an inflection point; moreover, since x = 1 is the only value of x such that
f''(x) = 0, it is the only inflection point.


To summarize some of the definitions and concepts from Problem 4.4: 


Definition: Let f be a differentiable function.
• If f''(x) > 0 for all x in some interval I, we say that f is concave up on I.


• If f''(x) < 0 for all x in some interval I, we say that f is concave down on I.


• If the concavity of f switches from up to down (or vice versa) at some point x, then we say (x, f(x)) is an
inflection point of the graph of f.


Problem 4.5: Suppose f is a twice-differentiable function and f” is continuous. 
(a) Prove that if (a, f(a)) is an inflection point, then f” (a) = 0. 
(b) Find an example where f”(a) = 0 but (a, f(a)) is not an inflection point. 


Solution for Problem 4.5: 


(a) If (a, f(a)) is an inflection point, then we know that the concavity of the function changes from up to down
(or from down to up) at a. Suppose it changes from up to down. Then we know that f''(c) > 0 for c < a
and f''(c) < 0 for c > a (both of these statements assume that c is sufficiently close to a). But since f'' is
continuous, we can't skip from f'' > 0 to f'' < 0 without some point at which f'' = 0; this is the Intermediate
Value Theorem. So we must have f''(a) = 0. The proof where the concavity of f switches from down to up
is essentially the same.


Note that we can get into some trouble if f” is not continuous—we can have f” jump directly from 
negative to positive (or vice versa) without attaining the value of 0. Such examples are a bit contrived, 
though; we will leave it as an exercise to try to construct one. 


(b) The concavity doesn’t have to change at a just because f”(a) = 0. There are many examples of this. One 
simple example is f(x) = x^4 at a = 0. We see that f'(x) = 4x^3 and f''(x) = 12x^2. Since f'' > 0 for all nonzero
x, we know that f is concave up everywhere except at x = 0. Thus, even though f''(0) = 0, we don't have an
inflection point at (0,0), since the function is concave up on both sides of x = 0. □
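The difference between “f'' = 0” and “the concavity changes” is easy to test numerically. The sketch below samples f'' just to the left and right of a candidate point for the two examples discussed in this section. The helper name `changes_concavity` and the offset `eps` are our own assumptions, and sampling two nearby points is only a heuristic, not a proof.

```python
def changes_concavity(d2f, a, eps=1e-3):
    """True if f'' has opposite signs at a - eps and a + eps.

    A sampling heuristic: it inspects only those two points.
    """
    return d2f(a - eps) * d2f(a + eps) < 0

cubic_d2 = lambda x: 6 * x - 6      # second derivative of x^3 - 3x^2 + 1
quartic_d2 = lambda x: 12 * x**2    # second derivative of x^4

# Both second derivatives vanish at the candidate point, but only one changes sign.
print(changes_concavity(cubic_d2, 1), changes_concavity(quartic_d2, 0))  # True False
```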


There are some other geometric ways to describe concavity. 


Problem 4.6: Let f be a twice-differentiable function with continuous first and second derivatives, and suppose 
that a € Dom(f) is such that f” > 0 on some interval I containing a. Show that the tangent line to the graph of 


f at (a, f(a)) lies below the graph of f along the interval I. 


Solution for Problem 4.6: Sketching a picture gives us an idea. We can see from the
picture that if the curve is concave up, then a segment of the tangent line should lie
below the curve. But how do we prove this rigorously?


Suppose that b is another point in I. Assume that b > a (the proof for b < a is
essentially the same). We need to show that f(b) is greater than the y-coordinate of
the point on the tangent line with x-coordinate b. This point lies on the line through
(a, f(a)) with slope f'(a), and this line has equation y - f(a) = f'(a)(x - a) in point-slope
form. Therefore the point on this tangent line with x-coordinate b has y-coordinate


f(a) + f'(a)(b - a).


Thus, in order to show that the graph lies above the tangent line when x = b, we
have to show that f(b) > f(a) + f'(a)(b - a). Let's rewrite this expression with all the terms on the same side,
dividing through by b - a > 0:


\[ \frac{f(b) - f(a)}{b - a} - f'(a) > 0. \]


That fraction on the left looks awfully familiar! It's once again the expression from the Mean Value Theorem. So
we know that there is some c between a and b (so that a < c < b) such that


\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]


Now we just have to prove that f'(c) - f'(a) > 0, or f'(c) > f'(a). But that's precisely the definition that f' is strictly
increasing, and we know f' is strictly increasing since f'' > 0 on I. So we're done. □
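As an illustration of the inequality f(b) > f(a) + f'(a)(b - a), the Python sketch below measures the gap between the graph and the tangent line for the hypothetical example f(x) = e^x, which has f'' = e^x > 0 everywhere. The name `tangent_gap`, the base point a = 0.5, and the sample grid are our own choices; checking finitely many points illustrates the result but does not replace the proof.

```python
import math

def tangent_gap(f, df, a, b):
    """f(b) minus the height at x = b of the tangent line through (a, f(a))."""
    return f(b) - (f(a) + df(a) * (b - a))

a = 0.5
# Sample b on both sides of a, skipping b = a itself (where the gap is 0).
gaps = [tangent_gap(math.exp, math.exp, a, a + k / 10)
        for k in range(-20, 21) if k != 0]
print(min(gaps) > 0)
```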


Not surprisingly, if a curve is concave down, then the tangent lines lie above the
curve, as shown in the picture to the right. The proof is essentially the same as that of 
Problem 4.6, but with the inequalities reversed. We’ll leave the details of the proof of 
this as an exercise. 


On the other hand, if f is concave up between a and b, what can we say about its 
secant lines? Looking at the picture on the next page, they appear to all lie above the 
graph. This is a reason why concave up portions of the graph of f are also called convex: 
because the area of the plane above the graph contains all its secant lines, and is thus a 
convex region of the plane. Similarly, secant lines of concave down graphs lie below the 
graph. This is also shown in the picture on the next page. 


We'll leave the proofs of these facts as exercises. They use a similar Mean Value
Theorem argument as the proof in Problem 4.6.


We can summarize the features of concavity in the following chart: 


Concave up | Concave down
f'' > 0 | f'' < 0
f' strictly increasing | f' strictly decreasing
Tangent lines rotate counterclockwise as x increases | Tangent lines rotate clockwise as x increases
Person walking on the graph is turning left | Person walking on the graph is turning right
Tangent lines below curve | Tangent lines above curve
Secant lines above curve | Secant lines below curve


Concept: Don’t memorize the above table! Learn and understand the geometric features of 


the second derivative. Keeping a basic example like f(x) = x^3 - x in mind is also
very helpful. 


EXERCISES 
4.1.1 Sketch the graphs of the following functions. Find the intervals in which the functions are increasing or 
decreasing, the intervals in which the functions are concave up or down, and any inflection points. 

x 
x +2 
4.1.2 Below is the graph of a function f. Determine the sign (positive, negative, or zero) of f’ and f” at each 
labeled point (or state that they are undefined). 


(a) 3x°-4x+3 (b) 


(c) x+sinx (da) xe 


y 


4.1.3 Show that if f is strictly monotonic, then f has an inverse. Is the same true if f is monotonic but not 
necessarily strictly monotonic? 


4.1.4 Prove the reverse of Problem 4.6: Let f be a twice-differentiable function with continuous second derivative, 
and suppose that a € Dom(f) is such that f” < 0 on some interval J containing a. Show that the tangent line to the 
graph of f at (a, f(a)) lies above the graph of f along the interval I. 
4.1.5* Prove that if f is concave up on some interval I, and a,b ∈ I, then the secant segment between (a, f(a))
and (b, f(b)) lies above the graph of f. Also prove that if we replace “concave up” with “concave down,” then the 
secant segment lies below the graph of f. Hints: 147, 186, 99 


4.2 EXTREMA AND OPTIMIZATION 


Probably the most common use of derivatives (at least in an introductory calculus course) is to find minimum 
and/or maximum points of functions. 


From what we've already seen in Chapter 2, we know that if f is continuous on an interval [a,b], then f must 
attain a minimum and a maximum on [a,b]. These values are known as extreme values or more simply extrema. 
What we'd like to do is use the derivative to find the extrema. A sketch gives us a good idea of what’s going on: 


y 


We've sketched a continuous function on [a,b] with maximum value f(c) on [a,b], and drawn the tangent line 
to the graph at the point (c, f(c)). We can see pretty clearly that the tangent line is horizontal. Let’s formally state 
this and prove it: 


Problem 4.7: Let f be a function defined on [a,b] and suppose that c ∈ (a,b) is such that f(c) is the maximum
value of f on [a,b]. Show that if f is differentiable at c, then f'(c) = 0.


Solution for Problem 4.7: We write the definition of the derivative f'(c):


\[ f'(c) = \lim_{h \to 0} \frac{f(c+h) - f(c)}{h}. \]


Assuming that h is small enough so that c + h ∈ [a,b], what do we know about f(c+h) - f(c)? Since f(c) is the
maximum value that f attains, we know that f(c+h) ≤ f(c); in other words, the numerator above is always
negative (or zero). But the denominator is positive for h > 0 and negative for h < 0.


Using the properties of limits, this tells us that
\[ \lim_{h \to 0^+} \frac{f(c+h) - f(c)}{h} \le 0 \]
and
\[ \lim_{h \to 0^-} \frac{f(c+h) - f(c)}{h} \ge 0. \]
But we know that both of these quantities exist and are equal to f'(c). Thus, we have f'(c) ≤ 0 and f'(c) ≥ 0, so the
only possibility is f'(c) = 0. □
Of course, essentially the same argument works for the minimum value of f on [a,b]. So we have:


Important: Let f be a function defined on [a,b]. If c ∈ (a,b) is such that f(c) is an extreme
value (a maximum or minimum) of f on [a,b], and f is differentiable at c, then
f'(c) = 0.


Note that there are a couple of subtle yet very important conditions in this statement, as we will see in the next 
problem: 


Problem 4.8: 
(a) Consider f(x) = x^2 on the interval [-1,2]. What is the maximum value of this function on [-1,2]? Is f' = 0
at this point? Why or why not? 


(b) Consider f(x) = |x| on the interval [—1,2]. What is the minimum value of this function on [—1, 2]? Is f’ = 0 
at this point? Why or why not? 


Solution for Problem 4.8: 


(a) The function clearly has its maximum on [-1,2] at f(2) = 4. But f'(x) = 2x, so f'(2) = 4 ≠ 0. We don't have
f' = 0 at the maximum point, but this is because 2 is the endpoint of our interval [-1,2], and there's no reason
to expect that f'(2) should be 0.


(b) The function clearly has its minimum on [-1,2] at f(0) = 0. But f'(0) is undefined for this function: the graph
has a “sharp corner” at (0,0), and thus no uniquely-defined tangent line.


□


As we saw in Problem 4.8, there are a couple of situations in which we might have an extreme value at c but 
not have f’(c) = 0. First, f might attain an extreme value at one of the endpoints of [a,b], and there’s no reason to 
expect that f’ = 0 at these points, as in the picture on the left below. 


Second, f might not be differentiable at an extreme value, as in the picture on the right above. Note that there 
is a “sharp corner” in the graph at the maximum point, so the tangent line there is not uniquely defined, and thus 
the function is not differentiable at that point. 


So here is the final word on extrema of continuous functions on closed intervals: 
Important: Let f be a continuous function on [a,b] and suppose that f(c) is an extreme
value (maximum or minimum) of f on [a,b] for some c ∈ [a,b]. Then one of the
following must be true:


1. c ∈ (a,b) and f'(c) = 0
2. c ∈ (a,b) and f'(c) is undefined
3. c = a or c = b


Points satisfying any of conditions 1, 2, or 3 above—that is, points where the derivative is 0 or undefined, and 
the endpoints of the interval—are called critical points of the function on the interval [a,b]. These are the only 
points at which the function can attain its maximum or minimum. 


Let’s see an easy example of how we can use critical points to find the minimum and maximum of a function 
on a closed interval. 


Problem 4.9: Let f(x) = x^2 - 2x + 4. We wish to find the minimum and maximum values of f(x) on the interval
[0, 3]. 
(a) First, solve this problem without using calculus—just use basic algebra (no derivatives). 


(b) Then, describe how calculus lets us solve the problem. 


Solution for Problem 4.9: 
(a) To better understand quadratic functions, we can complete the square:


f(x) = x^2 - 2x + 4 = (x^2 - 2x + 1) + 3 = (x - 1)^2 + 3.


Now it is clear that f(x) is minimal when x - 1 = 0, since the (x - 1)^2 term is always
nonnegative. So f attains its minimum when x = 1, and the minimum value is
f(1) = 3.

The (x - 1)^2 term will be maximal when |x - 1| is as large as possible; that is,
when x is as far away from 1 as possible. On [0,3], the number that is farthest
away from 1 is 3. So, f attains its maximum at x = 3, and the maximum value is
f(3) = 7.

(b) Maxima and minima can only occur at critical points of the interval. The two 
endpoints (x = 0 and x = 3) are critical points, and to find any others, we compute 
the derivative: f’(x) = 2x — 2. This is always defined, and is 0 only at x = 1. So 
the critical points are 0, 1, and 3. We now just need to check the values of f at these points: 


f(0) =4, f(1) =3, f(3) =7. 


So the minimum value of 3 is at x = 1, and the maximum value of 7 is at x = 3. Since the function is 
continuous, the Intermediate Value Theorem tells us that f([0,3]) = [3,7]. 




Just because we have calculus, we don’t have to use it. If there are simpler 
algebraic techniques that you can use, such as completing the square, go ahead 
and use them! 


Now we have a basic procedure for finding the maximum and minimum values of a continuous function on a 
closed interval: 

Concept: Suppose f is continuous on [a,b]. The extreme values (that is, maximum and 
minimum value) of f on [a,b] must occur at a critical point, which is one of 

• A point c ∈ (a,b) where f’(c) = 0, 
• A point c ∈ (a,b) where f’(c) is undefined, or 
• A boundary point of [a,b], namely c = a or c = b. 

To find the maximum or minimum of f on [a,b], we find the critical points of 
f and compute the value of the function at each critical point. The largest such 
value will be the maximum, and the smallest will be the minimum. 
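
The procedure in the box can be sketched in a few lines of Python. This is only an illustration, not part of the text's method: the helper name extreme_values is hypothetical, and it assumes the interior critical points have already been found by hand (it does not differentiate anything). The example reuses Problem 4.9.

```python
def extreme_values(f, interior_critical_points, a, b):
    """Return ((x_min, f_min), (x_max, f_max)) for a continuous f on [a, b].

    interior_critical_points: points in (a, b) where f' is 0 or undefined,
    found beforehand (assumption: the caller supplies all of them).
    """
    # The endpoints a and b are always critical points of f on [a, b].
    candidates = [a, b] + [c for c in interior_critical_points if a < c < b]
    values = [(c, f(c)) for c in candidates]
    lowest = min(values, key=lambda pair: pair[1])
    highest = max(values, key=lambda pair: pair[1])
    return lowest, highest


# Problem 4.9: f(x) = x^2 - 2x + 4 on [0, 3]; f'(x) = 2x - 2 vanishes only at x = 1.
lowest, highest = extreme_values(lambda x: x**2 - 2 * x + 4, [1], 0, 3)
print(lowest, highest)  # (1, 3) (3, 7)
```

The helper only compares finitely many values, which is exactly why the procedure needs the function to be continuous on a closed interval.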


Let’s try another basic example, one that would be hard to approach without calculus. 


Solution for Problem 4.10: First, we find the critical points. The endpoints x = 0 and x = 3 of the interval are 
automatically critical points. The others are the roots of 

$$f'(x) = 3x^2 - 5.$$

The roots of this are $x = \pm\sqrt{5/3}$, but only $x = \sqrt{5/3}$ lies in the interval [0,3]. 

Now that we have the critical points, we can compute the values of the function at them: 

$$f(0) = 0^3 - 5(0) = 0,$$
$$f\left(\sqrt{5/3}\right) = \left(\sqrt{5/3}\right)^3 - 5\sqrt{5/3} = -\frac{10}{3}\sqrt{\frac{5}{3}},$$
$$f(3) = 3^3 - 5(3) = 12.$$

We now clearly see that f(3) = 12 is the maximum and $f\left(\sqrt{5/3}\right) = -\frac{10}{3}\sqrt{5/3}$ is the minimum on the interval [0,3]. 
□ 


Our previous examples of finding extreme values considered functions defined on a closed interval. Let's 
widen our view a bit and look at the entire function. The maximum or minimum that the function obtains over 
its entire domain is called the global maximum or global minimum, as follows: 


Definition: Let f be a function. 
• We say that f has a global maximum (or just a maximum) at c if f(c) ≥ f(x) for all x ∈ Dom(f). 

• We say that f has a global minimum (or just a minimum) at c if f(c) ≤ f(x) for all x ∈ Dom(f). 


Of course, with many functions (such as f(x) = x for example), there is no guarantee that there will be any 
global extreme values at all. 


Instead of finding a global maximum or minimum, we are sometimes interested in finding what we call a local 
maximum or local minimum. A function f has a local maximum at c if f(c) is the largest possible value of f(x) 
for x “near” c. Of course, we'll have to be a bit more rigorous about what we mean by “near”: 

Definition: Let f be a function. 

• We say that f has a local maximum (or relative maximum) at c if there exists an open interval I containing 
c such that f(c) ≥ f(x) for any x ∈ (I ∩ Dom(f)). 

• We say that f has a local minimum (or relative minimum) at c if there exists an open interval I containing 
c such that f(c) ≤ f(x) for any x ∈ (I ∩ Dom(f)). 

The somewhat weird condition x ∈ (I ∩ Dom(f)) allows us to have local maxima or minima at the “endpoints” 
of the domain of f. For example, the function f(x) = √x has domain [0, +∞), and it has a local minimum at x = 0, 
since for any open interval I containing 0, we have 0 = f(0) ≤ f(c) for any c ∈ (I ∩ [0, +∞)). 


It is not always easy to find a global maximum or minimum. However, we can use our calculus tools to find 
local maxima and minima of a continuous function. 


Problem 4.11: Suppose that f is a continuous function with a local maximum or minimum at c. Show that 
f’(c) = 0 or f’(c) is undefined. 


Solution for Problem 4.11: By definition, there is an interval (a,b) such that f(c) is an extreme value of f on (a,b). 
But this means that c must be a critical point of f on [a,b], and since c is not one of the endpoints, we must have 
either f’(c) = 0 or f is not differentiable at c. □ 


Thus, if f is differentiable, we know that f’(c) = 0 at any point c (other 
than an endpoint of the domain of f) that gives a local maximum or minimum. 
However, we can actually say something a bit stronger than this. Let’s again 
look at our sketch of a local maximum, as shown at right. We see that f is 
increasing to the left of c and decreasing to the right of c. This makes perfect 
sense: a function increases up to a local maximum, and then decreases. So, 
from this sketch, we expect f’(x) > 0 for sufficiently close x < c, and f’(x) < 0 
for sufficiently close x > c. 


Sadly, this is not always true. We will leave it as a (hard) exercise to explore an example of a continuous 
differentiable function with a local maximum that does not have the above property. 


WARNING! Sketching a quick example is great to get an intuitive insight of a problem or 
concept, but a sketch is never a proof. 


However, the converse of the above discussion is true. Let’s write it formally and prove it, using our basic 
knowledge of the derivative: 


Problem 4.12: Suppose c is a point such that f’(c) = 0 and there exists some ε > 0 such that: 
1. f’(x) ≥ 0 for all x ∈ (c − ε, c), and 
2. f’(x) ≤ 0 for all x ∈ (c, c + ε). 

Prove that f(c) is a local maximum. 


Solution for Problem 4.12: Since f’ ≥ 0 on some interval (c − ε, c) to the left of c, we know that f is increasing on this 
interval. Specifically, we know that f(x) ≤ f(c) for x ∈ (c − ε, c). Similarly, since f’ ≤ 0 on some interval (c, c + ε) 
to the right of c, we know that f is decreasing to the right of c. Specifically, f(c) ≥ f(x) for x ∈ (c, c + ε). But this 
means that f(c) ≥ f(x) for all x ∈ (c − ε, c + ε). This is precisely the definition of a local maximum. □ 

We can reverse all of the inequalities in the solution to Problem 4.12 to get a comparable result for local minima. 
Putting this all together, removing the ε notation, and giving it a memorable name, we have: 


Important: The First Derivative Test: Let f be a continuous differentiable function, and let 
c ∈ Dom(f) be such that f’(c) = 0. Then: 

• If f’ ≥ 0 on an open interval to the left of c and f’ ≤ 0 on an open interval 
to the right of c, then f(c) is a local maximum. 

• If f’ ≤ 0 on an open interval to the left of c and f’ ≥ 0 on an open interval 
to the right of c, then f(c) is a local minimum. 


Again, don’t memorize the test as it’s written above. Think about what it means. If a function increases (so 
that f’ > 0), then “stops” (so that f’ = 0), then decreases (so that f’ < 0), then the point at which it “stops” must 
be a local maximum. Reversing all this (so that f is decreasing, then “stopping,” then increasing) gives a local 
minimum. 


There’s another way that we can tell how f’ is behaving, and that is by looking at f”. 


Solution for Problem 4.13: As usual, we can first see this visually: we sketch 
a picture at right. Since f’’(c) > 0, the function f is concave up at c, and if a 
function is concave up, then any critical point must be a local minimum, as 
shown in the picture. 

But, as usual, a sketch is not a proof; we want to prove it rigorously. So, 
we look at the definition of the second derivative: 

$$f''(c) = \lim_{h \to 0} \frac{f'(c+h) - f'(c)}{h}.$$

We know that f’(c) = 0 and f’’(c) > 0, so we have 

$$0 < f''(c) = \lim_{h \to 0} \frac{f'(c+h)}{h}.$$

Since this limit is positive, the fraction must be positive for both positive and negative h that are sufficiently close 
to 0. In particular, f’(c + h) > 0 when h > 0, and f’(c + h) < 0 when h < 0. 

But this means that f’ < 0 to the left of c and f’ > 0 to the right of c. This is exactly the condition of the First 
Derivative Test! Thus, we see that c is a local minimum. (Again, think of this informally: f is decreasing as it 
approaches c, then “levels out,” then increases as it moves past c.) □ 


Combining this result with the similar result where f’’(c) < 0 gives: 


Important: The Second Derivative Test: Suppose f is a continuous differentiable function 
and c is such that f’(c) = 0. 

• If f’’(c) > 0, then f(c) is a local minimum. 

• If f’’(c) < 0, then f(c) is a local maximum. 

You may wonder what happens in the Second Derivative Test if f’’(c) = 0. Unfortunately, in this situation, 
we can’t conclude anything. For example, f(x) = x³ has f’(0) = f’’(0) = 0, but (0,0) is neither a minimum nor a 
maximum. On the other hand, f(x) = x⁴ also has f’(0) = f’’(0) = 0, but (0,0) is a local minimum point. (In fact 
it’s a global minimum.) Similarly f(x) = −x⁴ has f’(0) = f’’(0) = 0, but has a maximum at (0,0). Thus, looking at 
these examples, we see that c such that f’(c) = f’’(c) = 0 might be a local maximum, a local minimum, or neither, 
and hence f’(c) = f’’(c) = 0 gives us no information. 
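
As a small illustration of the Second Derivative Test and its inconclusive case, here is a Python sketch. The function name second_derivative_test and the tolerance tol are our own hypothetical choices, and the derivatives are supplied by hand rather than computed by the program.

```python
def second_derivative_test(f_prime, f_double_prime, c, tol=1e-12):
    """Classify c using hand-computed derivatives f' and f''.

    tol is an assumed numerical tolerance for treating a value as zero.
    """
    if abs(f_prime(c)) > tol:
        return "f'(c) is not 0"
    curvature = f_double_prime(c)
    if curvature > tol:
        return "local minimum"
    if curvature < -tol:
        return "local maximum"
    return "inconclusive"  # f'(c) = f''(c) = 0 gives no information


# g(x) = x^3 - 3x has g'(x) = 3x^2 - 3 and g''(x) = 6x.
print(second_derivative_test(lambda x: 3 * x**2 - 3, lambda x: 6 * x, 1))
print(second_derivative_test(lambda x: 3 * x**2 - 3, lambda x: 6 * x, -1))
# f(x) = x^4 has f'(0) = f''(0) = 0, so the test cannot decide.
print(second_derivative_test(lambda x: 4 * x**3, lambda x: 12 * x**2, 0))
```

The last call returns "inconclusive" even though x⁴ does have a minimum at 0; in that situation we fall back on the First Derivative Test or a direct argument.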


We use our knowledge of critical points, along with the First and Second Derivative Tests, to help us solve 
optimization problems. In an optimization problem, we are trying to maximize or minimize a certain quantity, 
often subject to additional constraints. 


However, often our domain or our function will be unbounded, so we will find the following lemma useful: 


Problem 4.14: Let f be a differentiable function with [a,b] ⊆ Dom(f) such that f’(x) ≤ 0 for all x < a and 
f’(x) ≥ 0 for all x > b. Show that f has a global minimum f(c) with c ∈ [a,b]. 


Solution for Problem 4.14: Since f is continuous, we know that f has a minimum on [a,b]; that is, there exists 
some c ∈ [a,b] such that f(c) ≤ f(x) for all x ∈ [a,b]. In particular, f(c) ≤ f(a) and f(c) ≤ f(b). 

Further, for any x < a, we know that f is decreasing on [x,a] (since f’ ≤ 0 on this interval), so f(x) ≥ f(a) ≥ f(c). 
Similarly, for any b < x, we know that f is increasing on [b,x] (since f’ ≥ 0 on this interval), so f(x) ≥ f(b) ≥ f(c). 

Combining all this, we have f(x) ≥ f(c) for all x, and thus f(c) is a global minimum. □ 


Intuitively, the result of Problem 4.14 should make sense. It says that if we have a function that decreases until 
x = a and then increases after x = b, then whatever minimum occurs on [a,b] is actually the minimum for the 
entire function. Reversing our assumptions gives the opposite result: that is, if we have a function that increases 
until x = a and then decreases after x = b, then whatever maximum occurs on [a,b] is the maximum for the entire 
function. 


This result is quite useful for optimization problems in which the domain is something other than a closed 
interval, as in the following example. 


Problem 4.15: We wish to build a box with a square lid and a volume of 224 cubic inches. The material to 
build the lid of the box costs $5 per square inch, and the material to build the sides and bottom costs $2 per 
square inch. What is the cheapest box (including lid) that we can build? 


Solution for Problem 4.15: Our goal is to minimize the cost of the box subject to the constraint that the volume is 
equal to 224 cubic inches. We begin by assigning variables for the relevant quantities. Let x be the length (in 
inches) of a side of the lid (and thus also the length of a side of the bottom), and let h be the height (in inches) of 
the box. Note that the bottom and the lid will each have area x² (in square inches) and each of the four sides will 
have area xh. We can now write an expression for the cost of the box: 

$$\text{Cost} = 2(x^2 + 4xh) + 5x^2 = 7x^2 + 8xh.$$

Unfortunately, this expression has two variables in it. But we can use the volume constraint to eliminate one of 
the variables. In particular, the volume constraint means that 224 = x²h, so h = 224/x². We substitute this into the 
expression for cost, and we have the cost as a function of the side length: 

$$c(x) = 7x^2 + \frac{1792}{x}.$$

The domain of this function is (0, +∞). We compute the derivative: 

$$c'(x) = 14x - \frac{1792}{x^2}.$$

This is always defined on the domain of c. We have c’(x) = 0 when 1792/x² = 14x, so x³ = 1792/14 = 128, and 
x = ∛128 = 4∛2. Further, we see that c’(x) < 0 for x < 4∛2, and c’(x) > 0 for x > 4∛2. 

Thus, we know, by applying the result from Problem 4.14 and the First Derivative Test, that x = 4∛2 gives the 
global minimum of the function. We can also use the Second Derivative Test: 

$$c''(x) = 14 + \frac{3584}{x^3}.$$

Note c’’ > 0 for all x in the domain, so the function is concave up everywhere. Thus, any critical point must be a 
local minimum. 

Our conclusion is that the cheapest box has side length x = 4∛2 and height $h = \frac{224}{16\sqrt[3]{4}} = 7\sqrt[3]{2}$. This box costs 

$$\$7\left(16\sqrt[3]{4}\right) + \$8\left(4\sqrt[3]{2}\right)\left(7\sqrt[3]{2}\right) = \$336\sqrt[3]{4} \approx \$533.37.$$

(I guess that the box must be made of a really rare material!) □ 
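
A quick numerical sanity check of this answer, as a rough Python sketch. The names cost, x_star, and h_star are ours, and comparing against two nearby side lengths is only a spot check, not a proof of minimality.

```python
def cost(x):
    """Cost of the box (in dollars) as a function of the side length x > 0."""
    return 7 * x**2 + 1792 / x


x_star = 4 * 2 ** (1 / 3)      # the critical point x = 4 * cbrt(2)
h_star = 224 / x_star**2       # height from the volume constraint 224 = x^2 h

print(round(x_star, 4), round(h_star, 4), round(cost(x_star), 2))

# Spot check: slightly smaller or larger side lengths cost more.
print(cost(x_star - 0.1) > cost(x_star), cost(x_star + 0.1) > cost(x_star))
```

The printed cost should agree with the value 336∛4 ≈ 533.37 found above.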


Problem 4.16: A spring starts moving at time t = 0, and oscillates away from its equilibrium position; its 
position, as a function of time t (in seconds), is given by $y = e^{-t}\cos t$. Find the maximum and minimum values 
of its position. 


Solution for Problem 4.16: We're asked to find a global maximum and minimum. First, we note that since the 
values of t that we are considering all are nonnegative, the domain of the position function is [0, +00). So we'll 
need to check the critical point t = 0 (which is the endpoint of the domain interval), find all the local maxima and 
minima, and we’ll have to take into account what happens as t grows large. 


At t = 0, we have $f(0) = e^{0}\cos(0) = 1$. Also, since |cos t| ≤ 1, we have $|f(t)| \le e^{-t} < 1$ for all t > 0, and thus for all t > 0, we 
have f(t) < 1 = f(0). Hence, f(0) = 1 is the global maximum. 

To find the other critical points, we can start by computing the derivative: 

$$f'(t) = e^{-t}(-\sin t) + (-e^{-t})(\cos t) = -e^{-t}(\sin t + \cos t).$$

This derivative is always defined, so we don’t need to worry about critical points that occur because of an 
undefined derivative. We just need to find where f’(t) = 0. Since $e^{-t}$ is always nonzero, we have that f’(t) = 0 if 
and only if 

$$\sin t + \cos t = 0.$$

This simplifies to tan t = −1. So $t \in \left\{\frac{3\pi}{4}, \frac{7\pi}{4}, \frac{11\pi}{4}, \ldots\right\}$ are the critical points. 

To tell which are local maxima and which are local minima, we could try the Second Derivative Test. We 
compute: 

$$f''(t) = -e^{-t}(\cos t - \sin t) + e^{-t}(\sin t + \cos t) = 2e^{-t}\sin t.$$

This is positive at 3π/4, 11π/4, ..., and negative at 7π/4, 15π/4, .... So the critical points alternate being local minima and 
local maxima. 

We can see that at each local minimum, we have cos(t) = −√2/2, so we have $f(t) = -\frac{\sqrt{2}}{2}e^{-t}$. Thus, clearly t = 3π/4 
gives the smallest of the local minima, and we have that 

$$f\left(\frac{3\pi}{4}\right) = -\frac{\sqrt{2}}{2}e^{-3\pi/4} \approx -0.067$$

is the global minimum. 

Also, we see that at each local maximum, cos(t) = √2/2, so $f(t) = \frac{\sqrt{2}}{2}e^{-t}$ at a local maximum. The largest of these 
is when t = 7π/4, giving 

$$f\left(\frac{7\pi}{4}\right) = \frac{\sqrt{2}}{2}e^{-7\pi/4} \approx 0.003.$$

Thus, again we see that f(0) = 1 is the global maximum. □ 


We can draw a picture of the spring’s movement (note the different scales for the x- and y-axes): 

We see the minimum is indeed at t = 3π/4 ≈ 2.356. The cos t term of the function provides the oscillation of the 
function between positive and negative values, whereas the $e^{-t}$ term contributes a “damping” effect, pushing the 
function close to 0 very quickly. This function is an example of damped oscillation, which is very common in 
physics; we will also see more of this in Chapter 9. 
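
The values in Problem 4.16 are easy to spot-check numerically. The following Python sketch (with our own names position, critical_points, and values, and an arbitrary choice to look at only the first six critical points) evaluates the position at t = 0 and at the critical points t = 3π/4 + kπ.

```python
import math


def position(t):
    """Position of the spring at time t: e^(-t) * cos(t)."""
    return math.exp(-t) * math.cos(t)


# First few solutions of tan t = -1 with t > 0 (six is an arbitrary cutoff).
critical_points = [3 * math.pi / 4 + k * math.pi for k in range(6)]
values = [position(t) for t in critical_points]

print(position(0))            # value at the endpoint t = 0
print(round(min(values), 3))  # smallest value, attained at t = 3*pi/4
print(round(max(values), 3))  # largest value among these, attained at t = 7*pi/4
```

Because of the damping factor, later critical points give values ever closer to 0, so checking a handful of them already shows the pattern.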


The next problem is a classic optimization problem: 


Solution for Problem 4.17: The word “longest” in the problem statement is our cue that this is an optimization 
problem. We need to model this problem as a function that we can optimize. 


The length of a ladder that touches the corner can be expressed as a function of the 
angle of incidence with the corner, as in the picture to the right. In order to write this 
function, we'll need to label a few more lengths in the picture, which we have done 
in the diagram. Suppose the ladder makes angle θ with the corner, and that the corner 
divides the ladder into two pieces of lengths m and n, as shown in the diagram. We 
now see that 

$$\sin\theta = \frac{b}{n}, \qquad \cos\theta = \frac{a}{m}.$$

So the length of the longest ladder that can touch the corner at an angle of θ is given 
by the function 

$$f(\theta) = m + n = a\sec\theta + b\csc\theta.$$


Now we need to be careful and make sure that we know exactly what we want to do. Our ladder has to be 
able to rotate all the way around the corner, so its length can be no more than f(θ) for all θ ∈ (0, π/2). That is, we 
need a ladder that is at most as long as f(θ) for all relevant values of θ. Thus, minimizing f(θ) will give us the 
longest possible ladder that works for all relevant values of θ; that is, it will give us the longest ladder that can 
rotate around the corner. 

Note that the effective domain of f(θ) is θ ∈ (0, π/2), which makes sense: setting θ = 0 or θ = π/2 would allow 
an “infinite” ladder parallel to one of the corridors. Thus, we need to find the minimum of f(θ) for θ ∈ (0, π/2). 

Our next step is to compute the derivative, so that we can find the critical points: 

$$f'(\theta) = a\sec\theta\tan\theta - b\csc\theta\cot\theta.$$

This derivative is defined on all of (0, π/2), so we just find where it is equal to 0: 

$$0 = a\sec\theta\tan\theta - b\csc\theta\cot\theta.$$

Thus a sec θ tan θ = b csc θ cot θ. This expression is easier to work with in terms of sin and cos: 

$$\frac{a\sin\theta}{\cos^2\theta} = \frac{b\cos\theta}{\sin^2\theta}.$$

Thus 

$$\tan^3\theta = \frac{b}{a},$$

and hence $\theta = \tan^{-1}\left(\sqrt[3]{b/a}\right)$. 

Plugging this angle into f will give the maximum allowed length, which (for the record) is 

$$a\sec\left(\tan^{-1}\sqrt[3]{b/a}\right) + b\csc\left(\tan^{-1}\sqrt[3]{b/a}\right).$$

We can remove the trig functions from this answer by using the facts that for x > 0 (so that tan⁻¹ x ∈ (0, π/2)) we have 

$$\sec(\tan^{-1} x) = \sqrt{x^2 + 1} \quad\text{and}\quad \csc(\tan^{-1} x) = \frac{\sqrt{x^2 + 1}}{x}.$$

Thus we can rewrite our length as 

$$a\sqrt{(b/a)^{2/3} + 1} + b\,\frac{\sqrt{(b/a)^{2/3} + 1}}{(b/a)^{1/3}}.$$

Careful algebra can show that this equals $\left(a^{2/3} + b^{2/3}\right)^{3/2}$. □ 
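
The "careful algebra" can be double-checked numerically. The Python sketch below uses made-up values a = 8 and b = 27 (chosen only because their cube roots are integers), and the function names ladder_limit and longest_ladder are ours. It compares f at the critical angle with the closed form, and then scans a grid of angles as a rough check that nothing smaller occurs.

```python
import math


def ladder_limit(theta, a, b):
    """f(theta) = a*sec(theta) + b*csc(theta) for theta in (0, pi/2)."""
    return a / math.cos(theta) + b / math.sin(theta)


def longest_ladder(a, b):
    """Closed form (a^(2/3) + b^(2/3))^(3/2) for the minimum of f."""
    return (a ** (2 / 3) + b ** (2 / 3)) ** 1.5


a, b = 8.0, 27.0  # hypothetical values, not taken from the problem
theta_star = math.atan((b / a) ** (1 / 3))

print(ladder_limit(theta_star, a, b), longest_ladder(a, b))

# Rough grid scan over (0, pi/2): no sampled angle should beat the critical one.
grid = [k * (math.pi / 2) / 1000 for k in range(1, 1000)]
print(min(ladder_limit(t, a, b) for t in grid) >= longest_ladder(a, b) - 1e-9)
```

With these values both printed lengths should equal 13√13, since a^(2/3) + b^(2/3) = 4 + 9 = 13.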


EXERCISES 

4.2.1 Find the maxima and minima of the following functions on the specified domains: 

(a) 2x*-x+30n [0,2] 

(b) x*/(x-1) on (1,3] 

(c) 4x³ − 8x² + 1 on [−1,1] 

(d) sin x + x on [0, 2π] 

4.2.2 Find the value of k such that f(x) = x — : has a local maximum at x = —2. 

4.2.3. A group of people wish to charter a boat for an around-the-world voyage. The fee is $1000 per person for 
the first 100 people, but decreases by $5 (for everyone) for each person beyond 100. The charter company will 
have costs of $40000 + 200n, where n is the number of people on the trip. The boat’s capacity is 250. What number 
of people maximizes the company’s profit? 




4.2.4 Find a differentiable function (with domain R) that has 3 critical points, one of which is a local maximum, 
one of which is a local minimum, and one of which is neither. Hints: 143, 69 


4.2.5 We wish to make a soup can in the shape of a right circular cylinder (closed on both ends, of course) that 
will hold 200 cm³ of soup. The aluminum to make the can costs 0.2 cents per square centimeter. What is the 
cheapest can that we can make? 


4.2.6 Cody has a pet goat named Baroness Regina von Pufflestein, and he wants to create a grazing area for her 
along the back of his farmhouse. He has 40 meters of fencing, and will form a rectangular grazing area where 
the fencing makes up 3 sides of the rectangle and the side of the house makes up the fourth side of the rectangle. 
What is the maximum area that he can construct? 


4.2.7* You want to look at a 10-foot-tall statue that is on the top of a 50-foot-tall building (so that the top of the 
statue is 60 feet above the ground). Assume that you are 5 feet tall. How far away from the building should you 
stand in order to get the best view, which in this case means so that the statue takes up the largest possible angle 
in your field of vision? Hints: 10, 168 


4.3 VELOCITY 


In our previous discussions, we've alluded to our primary interpretation of the derivative: as a rate of change. 
In particular, if f(t) is a function where the variable t represents time, then f’(t) represents the instantaneous rate 
of change of f(t). As a special case, if f(t) represents a position of an object at time t, then f’(t) represents the 
object’s velocity. Further, f’’(t) represents the rate of change of velocity, which we call acceleration. 


Here’s an example: 


Solution for Problem 4.18: The rate of change—that is, the derivative—of position is velocity, and the rate of change 
of velocity is acceleration. We can compute these functions. (For obvious reasons, we typically denote velocity at 
time t by v(t) and acceleration at time t by a(t).) 


$$s(t) = 2t^3 - 21t^2 + 60t + 3,$$
$$v(t) = s'(t) = 6t^2 - 42t + 60,$$
$$a(t) = v'(t) = s''(t) = 12t - 42.$$


We can make a few observations. For instance, the particle starts at t = 0 at position s(0) = 3 with initial velocity 
v(0) = 60 and initial acceleration a(0) = —42. In words, the particle is moving to the right at 60 units/second, but is 
decelerating at a rate of 42 units/sec/sec. 


We also note the inflection point where a(t) = 0 at t = 7/2. This is the time at which the particle switches from 
accelerating to the left and starts accelerating to the right. 


We can also analyze the velocity. Specifically, we can factor v(t) = 6(t − 2)(t − 5). So the velocity is negative for 
2 < t < 5 and positive otherwise. This means that the particle is moving to the left between t = 2 and t = 5, and 
moving to the right otherwise. 

Using this information, we can now sketch a graph of the particle’s movement, 
shown at right. Note that motion upwards on this graph is motion where s is increasing, 
so the particle is moving to the right; conversely downward motion on the 
graph corresponds to the particle moving to the left. Between the dashed lines, where 
2 < t < 5, the particle is moving to the left (that is, its position is decreasing). We also 
note that the “peak” position of the particle during the interval 0 < t < 5 comes at 
t = 2. This is where the particle stops moving to the right and starts moving back to 
the left. Note that v(2) = s’(2) = 0 at this time. □ 
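
To see the observations from Problem 4.18 side by side, here is a short Python sketch that tabulates position, velocity, and acceleration at a few times. The function names and the sample times are our own choices for illustration.

```python
def position(t):
    """s(t) = 2t^3 - 21t^2 + 60t + 3."""
    return 2 * t**3 - 21 * t**2 + 60 * t + 3


def velocity(t):
    """v(t) = s'(t) = 6t^2 - 42t + 60 = 6(t - 2)(t - 5)."""
    return 6 * t**2 - 42 * t + 60


def acceleration(t):
    """a(t) = v'(t) = 12t - 42."""
    return 12 * t - 42


for t in (0, 1, 2, 3.5, 5, 6):
    v = velocity(t)
    direction = "right" if v > 0 else "left" if v < 0 else "at rest"
    print(t, position(t), v, acceleration(t), direction)
```

The rows for t = 2 and t = 5 show velocity 0, which is where the particle turns around, and the row for t = 3.5 shows acceleration 0.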


One very common application is acceleration due to gravity. On Earth, the acceleration 
due to the force of the planet’s gravity is a constant 32 ft/sec². (This neglects things like air resistance 
and the fact that at very high altitudes the Earth’s gravity is marginally smaller, but we will for now ignore such 
details.) A free-falling body’s height (in feet) is given by the equation 

$$y(t) = -16t^2 + v_0 t + y_0,$$

where t is in seconds, v₀ is the initial velocity in feet/second, and y₀ is the initial height in feet. (In the next chapter, 
we will see how to derive this formula.) Note that y(0) = y₀. Also, we can differentiate y(t) to get functions for 
velocity and acceleration: 

$$v(t) = y'(t) = -32t + v_0,$$
$$a(t) = v'(t) = y''(t) = -32.$$

Notice how these functions make sense: in particular, the velocity starts at v₀ and decreases by 32 every second, 
and the acceleration is our constant −32. 

In metric, the acceleration due to gravity is −9.8 m/sec². The formula for the body’s height in meters is thus: 

$$y(t) = -4.9t^2 + v_0 t + y_0,$$

giving 

$$v(t) = y'(t) = -9.8t + v_0,$$
$$a(t) = v'(t) = y''(t) = -9.8.$$


Here is a typical problem involving gravity: 


Problem 4.19: A cannon shoots a cannonball off of the top of a 1000-foot high cliff with an initial upwards 
velocity of 200 feet per second. 


(a) What is the maximum height (above the ground) that the cannonball attains? 


(b) How fast is the cannonball traveling downwards when it hits the ground (1000 feet below the top of the 
cliff)? 


Solution for Problem 4.19: We have y(0) = 1000 and v₀ = 200, so the height of the cannonball is given by 

$$y(t) = -16t^2 + 200t + 1000.$$


(a) We know that the maximum height is reached at the time when v(t) = y’(t) = 0. This should (by now) make 
intuitive sense to you. Moreover, this is the only critical point of the function other than its endpoints (at t = 0 
and at the time at which the cannonball hits the ground), and the heights of the cannonball at the endpoints 
are lower than the height at the critical point. 


We know that the velocity is v(t) = —32t + 200, so the velocity is 0 at t = 200/32 seconds after the launch. 
Thus, the maximum height is 


$$y(200/32) = -16(200/32)^2 + 200(200/32) + 1000 = 1625 \text{ feet}.$$

(b) The cannonball hits the ground when y(t) = 0, so we solve 

$$0 = -16t^2 + 200t + 1000.$$

Using the quadratic formula, we find 

$$t = \frac{-200 \pm \sqrt{40000 + 4(16)(1000)}}{-32} = \frac{25 \pm 5\sqrt{65}}{4}.$$

Since we only want to consider t > 0, the cannonball hits the ground $(25 + 5\sqrt{65})/4 \approx 16.328$ seconds after 
launch. At this time, it is traveling downwards at a velocity of 

$$v\left(\frac{25 + 5\sqrt{65}}{4}\right) = -32\left(\frac{25 + 5\sqrt{65}}{4}\right) + 200 = -40\sqrt{65} \approx -322.49$$

feet per second. (This is about 220 miles per hour.) 

□ 
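
The arithmetic in Problem 4.19 can be reproduced with a few lines of Python. This is just a check of the numbers above; the function names height and velocity and the variable names are ours.

```python
import math


def height(t, v0=200.0, y0=1000.0):
    """Height in feet after t seconds: -16t^2 + v0*t + y0."""
    return -16 * t**2 + v0 * t + y0


def velocity(t, v0=200.0):
    """Velocity in feet per second after t seconds: -32t + v0."""
    return -32 * t + v0


t_peak = 200 / 32                          # where v(t) = 0
t_ground = (25 + 5 * math.sqrt(65)) / 4    # positive root of y(t) = 0

print(height(t_peak))                      # maximum height in feet
print(round(t_ground, 3), round(velocity(t_ground), 2))
print(round(abs(velocity(t_ground)) * 3600 / 5280))  # impact speed in miles per hour
```

The last line converts feet per second to miles per hour using 5280 feet per mile and 3600 seconds per hour.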


EXERCISES 


4.3.1 A rock is thrown straight down from the top of a 480-foot high building, with an initial downward velocity 
of 16 feet/second. When does it hit the ground, and at what speed is it traveling when it hits the ground? 


4.3.2 A cannonball is shot at a 30 degree angle from the ground with an initial velocity of 30 meters/second. How 
far from the cannon is the cannonball when it hits the ground? 


4.3.3 A certain road has speed limit L(x) (in miles per hour), where x is the distance from the start of the road 
in miles. Cars A and B start at mile 0 and are at position a(t) and b(t) at time t (in hours); in particular note that 
a(0) = b(0) = 0. 


(a) Write an inequality that states the fact that car A is always going as fast or faster than car B. 
(b) Write an equation that states the fact that car B is always going exactly at the speed limit. 
(c) Say (in words) what the equation a(t + 1) = b(t) means (for all t > 0). 

(d) Say (in words) what the equation a’(t) = L(a(t — 1)) means (for all t > 1). 


4.4 TANGENT LINE APPROXIMATION 


Another application of the derivative comes directly from its definition as the slope of the tangent line to a 
point on a graph. Suppose f is a function that is continuous and differentiable at x = a. We can approximate f 
near a by its tangent line at (a, f(a)). The advantage of this is that the tangent line is a line, and lines are really easy 
to work with. 


Let’s see a simple example. 


Problem 4.20: Let f(x) = ∛x. Using f and f’, approximate ∛8.1 by constructing the tangent line to f at (8,2) 
and finding the point on this line with x = 8.1. 


Solution for Problem 4.20: We start by sketching f(x) = ∛x and drawing the tangent line at the point (8, 2): 

On the right above, we have zoomed in the graph near the point (8,2). We see that the graph looks really 
similar to its tangent line near the point (8,2). We can use this information to estimate f(8.1). The tangent line to 
f at (8,2) has slope f’(8), so we need to compute f’: 

$$f'(x) = \frac{1}{3}x^{-2/3} = \frac{1}{3\sqrt[3]{x^2}}.$$

So $f'(8) = \frac{1}{3 \cdot 4} = \frac{1}{12}$. Thus the equation 
of this line is 

$$\frac{y - 2}{x - 8} = \frac{1}{12},$$

so $y = \frac{1}{12}x + \frac{4}{3}$. Plugging in x = 8.1 gives us $y = \frac{8.1}{12} + \frac{4}{3} = \frac{241}{120}$. 

Using a calculator, we see that 241/120 ≈ 2.00833 and ∛8.1 ≈ 2.00830, so our approximation is good to 4 decimal 
places (in this example). In Chapter 7, we'll see how to estimate the error more generally. 
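
The same computation can be written as a short Python sketch. The helper tangent_line_value is a hypothetical name; it simply evaluates f(a) + f’(a)(x − a) from values of f(a) and f’(a) that we computed by hand above.

```python
def tangent_line_value(f_a, f_prime_a, a, x):
    """Value at x of the tangent line through (a, f(a)) with slope f'(a)."""
    return f_a + f_prime_a * (x - a)


# Problem 4.20: f(x) = cbrt(x), a = 8, f(8) = 2, f'(8) = 1/12.
estimate = tangent_line_value(2.0, 1 / 12, 8.0, 8.1)
actual = 8.1 ** (1 / 3)

print(round(estimate, 5), round(actual, 5), abs(estimate - actual))
```

The printed difference is on the order of a few hundred-thousandths, matching the comparison of 2.00833 with 2.00830.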


Let’s try to generalize our example from Problem 4.20. 


Solution for Problem 4.21: The tangent line through the point (a, f(a)) has slope f’(a). This gives us the point-slope 
form 

$$y - f(a) = f'(a)(x - a),$$

or y = f(a) + f’(a)(x − a). □ 

We can use the equation from Problem 4.21 as a method for approximating functions near the point (a, f(a)). 

Important: The tangent line approximation of a function f near a is 

$$f(x) \approx f(a) + f'(a)(x - a).$$


We can see what's going on by looking at the diagram to the right. We 
see that the point on the tangent line has y-coordinate f(a) + f’(a)(x − a), 
and this is a close approximation to the actual value f(x). 

The general idea is that we can use a tangent line approximation 
in situations where f(a) and f’(a) are easy to compute but f(x) may be 
difficult to compute. 

We can use tangent line approximation to compute the approximate value of a 
function at a value that is hard to compute (such as ∛8.1 in Problem 4.20), but 
that is near a value that is easy to compute (such as ∛8 in Problem 4.20). 


Intuitively, the closer x is to a, the more accurate the approximation 

$$f(x) \approx f(a) + f'(a)(x - a)$$

will be. We will discuss this more precisely in Chapter 7. 

We often write the above formula using differential notation, as 
follows. We are trying to approximate f near the point (a, f(a)). 
If we look at the tangent line at (a, f(a)), we see that the ratio of 
the change in y (denoted Δy) to the change in x (denoted Δx) is 
approximately the slope of the tangent line, which is f’(a). This 
gives us Δy/Δx ≈ f’(a), or Δy ≈ f’(a)Δx. 

We can also do this computation algebraically. We start with our 
tangent line approximation 

$$f(x) \approx f(a) + f'(a)(x - a).$$

Subtracting f(a) from both sides gives 

$$f(x) - f(a) \approx f'(a)(x - a).$$

Note that f(x) − f(a) is the change in the y-coordinate on the graph 
of the function as we move from (a, f(a)) to (x, f(x)), so we have 
Δy = f(x) − f(a). Similarly, Δx = x − a. Thus: 

Important: Near the point (a, f(a)), we have Δy ≈ f’(a)Δx. 


Thus, once again, the derivative represents the rate of change of y with respect to x. 


Important: This is probably the most important concept of calculus: the derivative f’ 
represents the rate of change of f. 


Replacing a function with its tangent line approximation at a point x = a is also called the local linearization 
of the function at x = a (or simply at a). 


Concept: Near x = a, a differentiable function and its local linearization at a are almost the same. 


The above statement is pretty vague, with the mathematically imprecise words “near” and “almost.” We can 
more precisely state what we mean by the local linearization being “almost” the same as f(x) “near” x = a: 

Problem 4.22: Let f be a differentiable function with f’ continuous. Define E(x) to be the error in the 
approximation of f(x) by its local linearization at a; that is 

$$E(x) = f(x) - \left(f(a) + f'(a)(x - a)\right).$$

Prove that $\lim_{x \to a} \frac{E(x)}{x - a} = 0$. What (in words) does this mean? 


Solution for Problem 4.22: We simply compute: 

$$\lim_{x \to a} \frac{E(x)}{x - a} = \lim_{x \to a} \frac{f(x) - f(a) - f'(a)(x - a)}{x - a} = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} - f'(a) = f'(a) - f'(a) = 0.$$

In words, this means that E(x) approaches 0 faster than x − a does as x → a. So as x gets near a, the error 
in the linear approximation approaches 0 faster than x approaches a. This is the formal explanation of what we 
mean when we say that the linear approximation is “close” to f(x) “near” x = a. □ 


You can explore this idea further in Challenge Problem 4.51. 


Solution for Problem 4.23: We first compute $f'(x) = 2e^{2x}$. So $f'(0) = 2e^0 = 2$, and thus the tangent line at x = 0 has 
slope 2. Since f(0) = 1, the tangent line passes through (0,1). Therefore the tangent line is y = 2x + 1, and this is 
the local linearization of $e^{2x}$ at x = 0. 


To approximate $e^{0.002} = f(0.001)$, we plug in x = 0.001 to the local linearization: 
$$f(0.001) \approx 2(0.001) + 1 = 1.002.$$
The actual value of $e^{0.002}$ to 6 decimal places is 1.002002, so this is a very close approximation. □ 
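
As a quick numerical check, here is a minimal Python sketch that builds the local linearization of $e^{2x}$ at 0 and compares it with the true value. The helper name `local_linearization` is our own choice for illustration. The last column is the ratio $E(x)/(x - a)$ from Problem 4.22 with a = 0, which should shrink toward 0 as x approaches 0.

```python
import math

def local_linearization(f, fprime, a):
    """Return L(x) = f(a) + f'(a)(x - a), the tangent line to f at x = a."""
    fa, slope = f(a), fprime(a)
    return lambda x: fa + slope * (x - a)

f = lambda x: math.exp(2 * x)
fprime = lambda x: 2 * math.exp(2 * x)
L = local_linearization(f, fprime, 0.0)   # this is the line y = 2x + 1

for x in (0.1, 0.01, 0.001):
    error = f(x) - L(x)                   # E(x) from Problem 4.22 with a = 0
    print(f"x = {x}: L(x) = {L(x):.6f}, f(x) = {f(x):.6f}, E(x)/x = {error / x:.6f}")
```

The ratio in the last column is what "the error goes to 0 faster than x - a" looks like in practice.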


One nice advantage of the $\Delta y \approx f'(x)\,\Delta x$ perspective is that sometimes we're only interested in the value of 
the change away from a point. Specifically, one application of tangent line approximation—also called linear 
approximation—is to estimate the error of an observable quantity when we know that there is an error in our 
measurement. Let’s look at a couple of examples. 


Problem 4.24: Paul is trying to measure the height of a building using a sextant. He stands 100 meters away 
and determines that the angle from the ground where he is standing to the top of the building is 30°, so he 
computes that the height of the building is $\frac{100}{\sqrt{3}}$ meters. But, if his sextant is only accurate to within 1 degree, 
then (approximately) by how much might his measurement of the height of the building be off? 


Solution for Problem 4.24: We start by determining the function that converts Paul’s sextant measurement to a 
height measurement. But be careful! Don’t do this: 


Bogus Solution: The height of the building is given by f(x) = 100 tan(x). 


This won't work, because the angle given by the sextant is in degrees! In calculus, we always need to have our 
angles in radians, or else we won't get the correct derivatives. So we must convert the angle to radians first, and 
we get the height function 


$$f(x) = 100\tan\left(\frac{\pi x}{180}\right),$$


where x is the angle in degrees. 


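To compute how much our height might be off, we use our $\Delta y \approx f'(x)\,\Delta x$ formula. We are told that $|\Delta x| \le 1$. 
We now need to compute the derivative of f: 

$$f'(x) = 100 \cdot \frac{\pi}{180}\sec^2\left(\frac{\pi x}{180}\right) = \frac{5\pi}{9}\sec^2\left(\frac{\pi x}{180}\right).$$

Therefore, since we have x = 30, we get 

$$f'(30) = \frac{5\pi}{9}\sec^2\left(\frac{\pi}{6}\right) = \frac{5\pi}{9}\cdot\frac{4}{3} = \frac{20\pi}{27}.$$

Thus, using $|\Delta x| \le 1$ we have 

$$|\Delta y| \approx |f'(30)\,\Delta x| \le f'(30) = \frac{20\pi}{27}.$$

So we estimate that Paul's measurement may be off by approximately $\frac{20\pi}{27}$ meters, which is about 2.327 meters. □ 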
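
To see how good this estimate is, we can compare it with the actual change in computed height when the angle is off by a full degree in either direction. The following Python sketch does that; the function names `height` and `height_prime` are ours, chosen only for this illustration.

```python
import math

def height(deg):
    """Computed building height (meters) from an angle in degrees, standing 100 m away."""
    return 100 * math.tan(math.radians(deg))

def height_prime(deg):
    """Derivative of height with respect to the angle measured in degrees."""
    return 100 * (math.pi / 180) / math.cos(math.radians(deg)) ** 2

estimate = height_prime(30) * 1           # |dy| is about f'(30) |dx| with |dx| = 1
actual_up = height(31) - height(30)       # true change if the angle is really 31 degrees
actual_down = height(30) - height(29)     # true change if the angle is really 29 degrees

print(f"linear estimate: {estimate:.3f} m")
print(f"actual change:   {actual_down:.3f} m (down), {actual_up:.3f} m (up)")
```

The linear estimate lands between the two actual changes, which is what we would expect for a curve that is concave up near x = 30.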


Problem 4.25: A distant planet (assumed to be a perfect sphere) is measured, using a telescope, to have a 
diameter of 2000 kilometers. However, it is known that the telescope will introduce a measurement error of 
within ±1%. What is the estimated percentage error in the computed volume of the planet? 


Solution for Problem 4.25: We can write the volume as a function of the diameter: 

$$V = \frac{4}{3}\pi\left(\frac{d}{2}\right)^3 = \frac{\pi}{6}d^3.$$

Note that we use V for volume and d for diameter, for the obvious reason that it's easier to remember what the 
variables stand for. 


Concept: Don't be too attached to x and y. Use variable names that relate to the quantities 
that they represent. 


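The error in the computation of the volume can be approximated by 
$$\Delta V \approx V'\,\Delta d.$$


Note that $V' = \frac{3\pi}{6}d^2 = \frac{\pi}{2}d^2$, so when d = 2000 we have $V' = \frac{\pi}{2}(2000)^2 = 2000000\pi$. We are also given that 
$|\Delta d| \le 0.01d = 20$, so 
$$|\Delta V| \approx |V'\,\Delta d| \le (2000000\pi)(20) = (4 \cdot 10^7)\pi.$$


Also we know that $V = \frac{\pi}{6}d^3 = \frac{\pi}{6}(2000)^3 = \frac{4}{3} \cdot 10^9\,\pi$. So 

$$\frac{|\Delta V|}{V} \le \frac{(4 \cdot 10^7)\pi}{(4/3) \cdot 10^9\,\pi} = \frac{3}{100} = 3\%.$$


So the error in the volume measurement is approximately ±3%. 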


In fact, we didn't need to plug in d. We could have left the expressions in terms of d and simplified algebraically, 
as follows: 

$$\frac{|\Delta V|}{V} \approx \frac{|V'\,\Delta d|}{V} = \frac{\frac{\pi}{2}d^2\,|\Delta d|}{\frac{\pi}{6}d^3} = \frac{3|\Delta d|}{d}.$$

Thus, whatever the relative error in the measurement of d, we see that the relative error in the measurement of V 
is approximately 3 times as great. Specifically, in our problem, since we are given that $\frac{|\Delta d|}{d} = 1\%$, we conclude that 

$$\frac{|\Delta V|}{V} \approx 3(1\%) = 3\%.$$
□ 
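
We can compare the "3 times the relative error" rule with the exact relative change in volume. This is a small Python sketch; `relative_volume_change` is a helper name we made up for the purpose.

```python
import math

def volume(d):
    """Volume of a sphere of diameter d: V = (pi / 6) d^3."""
    return math.pi * d**3 / 6

def relative_volume_change(d, rel_error):
    """Exact relative change in V when d is replaced by d * (1 + rel_error)."""
    return (volume(d * (1 + rel_error)) - volume(d)) / volume(d)

d = 2000.0
for rel_error in (0.01, -0.01):
    exact = relative_volume_change(d, rel_error)
    print(f"relative error in d = {rel_error:+.2%}: exact {exact:+.4%}, linear estimate {3 * rel_error:+.4%}")
```

The exact values differ from ±3% only in the later decimal places, and the result does not depend on the particular value of d, in agreement with the algebra above.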


Once again, we will explore how to compute the error in a tangent line approximation in Chapter 7, but there’s 
one thing that we can usually easily determine: is the approximation given by the tangent line too small or too 
large? To do this, we can look at the second derivative. 


Solution for Problem 4.26: If $f''(x) > 0$ near x = a, then the curve is concave up, meaning 
that the tangent line lies below the curve, as shown in the picture to the right. So the 
y-coordinate of a point on the tangent line (with x-coordinate near a) is going to be less than 
the y-coordinate of the corresponding point on the graph of f. Therefore, the tangent 
line approximation is going to be too low. □ 


Conversely, if $f''(x) < 0$ near x = a, then the curve is concave down, meaning that the 
tangent line lies above the curve. So in this case, a tangent line approximation is going 
to be too high. 


EXERCISES 
4.4.1 Use tangent line approximation to approximate the following quantities: 
(a) 80 (b) cos62° (c) log, 17 


4.4.2. A cube is measured to have side 30 cm, but the measuring equipment (a cheap ruler) has a measurement 
error of 2mm. What is the approximate error in the measured volume of the cube? 


4.4.3 I want to paint a sphere (with diameter 1 meter) with a 2 mm-thick coating of paint. Approximately what 
volume of paint do I need? 


4.4.4 We have a right circular cylinder whose radius equals its height. I want to measure the height in such a way 
that the measured volume is estimated to be within 2% of the actual volume. How much error (as a percentage) 
can I tolerate in the measurement of the height? 


4.4.5★ Let $f(x) = x^n$ where $n$ is a positive integer. Use the Binomial Theorem to explain why $f(x + \epsilon) \approx f(x) + \epsilon f'(x)$ 
for small values of $\epsilon$. 


4.5 NEWTON'S METHOD 


One profound application of tangent line approximation is called Newton’s Method for approximating a root 
of a function. This is a numerical method, meaning that its use gives us a numerical approximation of the root. 
One nice thing (as we'll see) about Newton’s Method is that using it repeatedly usually lets us get as close an 
approximation to a root as we want. 


Here's the setup: we have a differentiable function f, and we want to find a root r (so that f(r) = 0). We start 
by making an initial guess $r_0$. The idea is to guess a value that's easy to calculate. Unless we are psychic, this 
initial guess will probably be wrong. Then we think about how we can use tangent line approximation to refine 
our guess. 


Solution for Problem 4.27: Using the tangent line approximation at $r_0$, we get 

$$f(x) \approx f(r_0) + f'(r_0)(x - r_0).$$

Let's draw a picture with our initial guess $r_0$ and the tangent line at $(r_0, f(r_0))$ displayed: 


[Figure: the graph of f with the tangent line at the point $(r_0, f(r_0))$.] 


We solve for the x-intercept of the tangent line: 

$$0 = f(r_0) + f'(r_0)(x - r_0).$$

This simplifies to $x - r_0 = -\frac{f(r_0)}{f'(r_0)}$, so we have $x = r_0 - \frac{f(r_0)}{f'(r_0)}$. 


Of course, unless we were really lucky, this value of x is not actually a root of f; it is only a root of the tangent 
line approximation. But we hope that this x is a better approximation of a root of f than our initial blind guess $r_0$, 
because we believe that the tangent line is a good approximation to f. So let's set 

$$r_1 = r_0 - \frac{f(r_0)}{f'(r_0)},$$

with the idea that $r_1$ is a better guess for a root of f than $r_0$ was. 


Now, we repeat the process from Problem 4.27 again, starting at $r_1$. 


Referring to the picture above (which we've blown up a bit for clarity), we take the root of the first tangent line 
approximation, call that $r_1$, and use that as our next “guess” of the root of the function. We repeat the process: we 
take the tangent line approximation to the function at $r_1$ and find the root of this new tangent line. This root, we 
hope, should be a lot closer to the actual root of f. 


Specifically, we let 
$$r_2 = r_1 - \frac{f(r_1)}{f'(r_1)},$$
and now $r_2$ should be an even better guess at the root than $r_1$ (and a lot better than our initial blind guess $r_0$). 


More generally, we can repeat this for as long as we want. At each step, we take our guess $r_{n-1}$ from the 
previous step, and set 

$$r_n = r_{n-1} - \frac{f(r_{n-1})}{f'(r_{n-1})}$$

for all positive integers n. As we perform more and more steps, we should get closer and closer to an actual root. 
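
The recurrence above translates almost word for word into a short program. Here is a minimal Python sketch; the function name `newton`, its arguments, and the sample function $x^2 - 3$ are our own choices for illustration, not part of the text.

```python
def newton(f, fprime, r0, steps):
    """Apply r_n = r_{n-1} - f(r_{n-1}) / f'(r_{n-1}) and return [r_0, r_1, ..., r_steps]."""
    guesses = [r0]
    for _ in range(steps):
        r = guesses[-1]
        slope = fprime(r)
        if slope == 0:
            # The tangent line is horizontal, so it has no x-intercept.
            raise ZeroDivisionError("f'(r) = 0: Newton's Method cannot continue")
        guesses.append(r - f(r) / slope)
    return guesses

# Hypothetical example: a root of f(x) = x^2 - 3, starting from the guess r_0 = 2.
guesses = newton(lambda x: x * x - 3, lambda x: 2 * x, 2.0, 4)
for n, r in enumerate(guesses):
    print(f"r_{n} = {r:.10f}")
```

Returning the whole list of guesses, rather than only the last one, makes it easy to watch how quickly the guesses settle down.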


Let’s see a basic example. 


Solution for Problem 4.28: We might first think of simply using a tangent line approximation. However, trying to 
use a tangent line approximation using the function $x^{1/5}$ is not going to work very well, because 23 is pretty far 
from any value of x at which we know $x^{1/5}$ exactly and could construct our tangent line. 


For example, we could try a tangent line approximation to $f(x) = x^{1/5}$ at the point (32, 2). We have 
$\Delta x = 23 - 32 = -9$ and $f'(x) = \frac{1}{5}x^{-4/5}$. So $f'(32) = \frac{1}{5} \cdot 32^{-4/5} = \frac{1}{80}$. Thus our approximation is 
$$(23)^{1/5} \approx 2 - 9\left(\frac{1}{80}\right) = \frac{151}{80} = 1.8875.$$


This is not horrible, but because $\Delta x$ was so large, it doesn't match the actual value of $23^{1/5}$ very closely. (We do 
know that our estimate is too high, since $f''(x) = -\frac{4}{25}x^{-9/5} < 0$, so f is concave down.) 


Let's see how Newton's Method gives a better answer. Instead of working with $f(x) = x^{1/5}$, we note that $\sqrt[5]{23}$ is 
a root of $g(x) = x^5 - 23$. So we try to estimate the numerical value of the root of g using Newton's Method. First, 
note that $g'(x) = 5x^4$. 


We start with an initial guess of $r_0 = 2$, because the fifth root of 23 is reasonably close to the fifth root of 32, 
which is 2. Then the first step of Newton's Method gives 

$$r_1 = 2 - \frac{g(2)}{g'(2)} = 2 - \frac{9}{80} = 1.8875.$$


Note that this is the same answer as the tangent line approximation gave us. This is not a coincidence: the first step 
of Newton’s Method is just using a tangent line approximation to guess at the root. The advantage of Newton’s 
Method, though, is that we can refine our initial estimate by applying tangent line approximation again and again. 


In general, we have 

$$r_n = r_{n-1} - \frac{g(r_{n-1})}{g'(r_{n-1})} = r_{n-1} - \frac{r_{n-1}^5 - 23}{5r_{n-1}^4}.$$

So we can continue: 

$$r_2 = 1.8875 - \frac{g(1.8875)}{g'(1.8875)} \approx 1.8875 - \frac{0.95713}{63.46260} \approx 1.8875 - 0.01508 = 1.87242.$$


Now our estimate is 1.87242. 


We could do as many more iterations as we like; let's do one more: 

$$r_3 = 1.87242 - \frac{g(1.87242)}{g'(1.87242)} \approx 1.87242 - \frac{0.01528}{61.45866} \approx 1.87242 - 0.00025 \approx 1.87217.$$


Now our estimate is 1.87217. 


You can put $23^{1/5}$ into your calculator, and you'll see that this is now accurate to 5 decimal places. □ 
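
The hand computation above can be replayed by a few lines of Python. This is only a sketch of the same iteration for $g(x) = x^5 - 23$; the variable names are ours.

```python
g = lambda x: x**5 - 23
g_prime = lambda x: 5 * x**4

iterates = [2.0]                          # r_0 = 2, since 2^5 = 32 is close to 23
for n in range(1, 5):
    r = iterates[-1]
    iterates.append(r - g(r) / g_prime(r))
    print(f"r_{n} = {iterates[-1]:.5f}")

print(f"calculator value of 23^(1/5): {23 ** 0.2:.5f}")
```

Because the program carries full precision from step to step instead of rounding to 5 decimal places, its later digits may differ very slightly from a hand computation that rounds along the way.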


A more typical example of Newton’s Method might be to approximate a solution to an equation: 


Solution for Problem 4.29: What we're really doing is trying to find a root of f(x) = x — cosx. This function has 
derivative f’(x) = 1+sinx. 


Part of the skill in applying Newton's Method is making a decent initial guess. In this problem, $r_0 = 0$ seems 
like a simple place to start. We then compute: 

$$r_1 = 0 - \frac{f(0)}{f'(0)} = 0 - \frac{0 - \cos 0}{1 + \sin 0} = 1.$$

The next step is then 

$$r_2 = 1 - \frac{f(1)}{f'(1)} = 1 - \frac{1 - \cos 1}{1 + \sin 1}.$$

We can approximate these trig functions to 3 decimal places using a calculator: 
$$r_2 \approx 1 - \frac{0.460}{1.841} \approx 0.750.$$
We continue: 
$$r_3 = 0.750 - \frac{f(0.750)}{f'(0.750)} = 0.750 - \frac{0.750 - \cos 0.750}{1 + \sin 0.750}.$$
Again, we approximate this to 3 decimal places: 
$$r_3 \approx 0.750 - \frac{0.018}{1.682} \approx 0.739.$$
Let's do it one more time: 
$$r_4 = 0.739 - \frac{f(0.739)}{f'(0.739)} = 0.739 - \frac{0.739 - \cos 0.739}{1 + \sin 0.739}.$$
Computing again to 3 decimal places: 
$$r_4 \approx 0.739 - \frac{0.000}{1.674} \approx 0.739.$$

The “0.000” in the numerator of the above fraction indicates that we are at a point in the process where our answer 
will not change in the first 3 decimal places, so we can stop now. 


Indeed, we check using a calculator that cos 0.739 = 0.739142477..., and these quantities agree in the first three 
decimal places. □ 
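
The stopping rule used above ("stop once the correction no longer changes the digits we care about") can be built into a program. The sketch below is one way to do it in Python; the function name `solve_by_newton` and the parameters `tol` and `max_steps` are assumptions of ours, not anything prescribed by the method.

```python
import math

def solve_by_newton(f, fprime, r0, tol=1e-10, max_steps=50):
    """Run Newton's Method until the correction f(r)/f'(r) is smaller than tol."""
    r = r0
    for step in range(1, max_steps + 1):
        correction = f(r) / fprime(r)
        r -= correction
        if abs(correction) < tol:         # the guess has stopped changing
            return r, step
    raise RuntimeError("the guesses did not settle down within max_steps")

root, steps = solve_by_newton(lambda x: x - math.cos(x),
                              lambda x: 1 + math.sin(x),
                              0.0)
print(f"root is about {root:.9f}, found in {steps} steps")
print(f"cos(root) is  {math.cos(root):.9f}")
```

Capping the number of steps is a safety measure: without it, a bad starting guess could keep the loop running forever.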


If we wanted more accuracy, we could continue to run Newton’s Method, keeping track of more decimal 
places. 


Newton's Method seems like a good method for finding a root. It is simple enough that you can write a very 
short computer program to execute it. But it’s not foolproof, and it’s worth thinking about situations where it 
might fail to accurately approximate a root. Obviously it will fail if there is no root to be found. For example, try 
Newton's Method on the function $f(x) = x^2 + 1$ and see what happens! It might also converge to the wrong root 
if you choose a bad starting point, since many functions have more than one root. Also, we get stuck if we end 
up at a point with f’ = 0, because then the tangent line is horizontal, so the formula for Newton’s Method would 
require us to divide by 0. 
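
Two of these failures are easy to watch in a short Python sketch. We assume the function $f(x) = x^2 + 1$ (which has no real root) and two starting guesses picked by us for illustration; the helper `newton_steps` simply stops early if it reaches a horizontal tangent line.

```python
def newton_steps(f, fprime, r0, steps):
    """Return the list of Newton guesses, stopping early if f'(r) = 0."""
    guesses = [r0]
    for _ in range(steps):
        r = guesses[-1]
        slope = fprime(r)
        if slope == 0:
            break                         # horizontal tangent: we would divide by 0
        guesses.append(r - f(r) / slope)
    return guesses

f = lambda x: x * x + 1                   # no real root
fprime = lambda x: 2 * x

stuck = newton_steps(f, fprime, 1.0, 5)       # lands on x = 0, where f'(0) = 0
wandering = newton_steps(f, fprime, 0.5, 20)  # never gets close to a root

print("starting at 1.0:", stuck)
print("starting at 0.5:", [round(r, 3) for r in wandering[:8]], "...")
```

Starting at 1, the very first step lands on x = 0, where the tangent line is horizontal. Starting at 0.5, the guesses jump around without settling down, since there is nothing for them to converge to.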


EXERCISES 


4.5.1 Use Newton’s Method to estimate ¥2. (Whatever starting point you pick, do two iterations, then check 
against a calculator to see how close you are.) 


4.5.2★ A classic example in which Newton's Method fails is $f(x) = \sqrt[3]{x}$ where we start too far away from 0. 
Compute 3 steps of Newton's Method for this function using the initial guess $r_0 = 1$. Can you explain what is 
happening? Hints: 210 


4.6 RELATED RATES 


We have already mentioned several times that the most important interpretation of the derivative of a function 
is as the rate of change of that function. We can take this a step further. Suppose we have two (or more) functions 
that are somehow dependent on each other. We can use their derivatives to compare their rates of change. 


The term related rates refers to two quantities that are dependent on each other and that are changing over 
time. We can use the dependent relationship between the quantities to determine a relationship between their 
rates of change. 


Here is a basic example: 


Solution for Problem 4.30: Let r denote the radius of the balloon (in cm) and V be its volume (in cm³). This already 
is a little bit deceptive, since both of the quantities are changing over time. So what we should really do is let r(t) 
denote the radius and V(t) denote the volume, where r and V are functions of t, and where t denotes the time (in 
seconds). The relationship between the quantities is 

$$V(t) = \frac{4}{3}\pi (r(t))^3.$$

What we want is an expression that relates the rates of change of these quantities. For that, we take the derivative 
of the entire equation with respect to t; that is, we take 

$$\frac{d}{dt}V(t) = \frac{d}{dt}\left(\frac{4}{3}\pi (r(t))^3\right).$$

The left side of this equation is just $\frac{dV}{dt}$. On the right side, we must use the Chain Rule: 

$$\frac{d}{dt}\left(\frac{4}{3}\pi (r(t))^3\right) = \frac{4}{3}\pi \frac{d}{dt}\left((r(t))^3\right) = 4\pi (r(t))^2 \frac{dr}{dt}.$$

Thus $\frac{dV}{dt} = 4\pi (r(t))^2 \frac{dr}{dt}$. Now we can plug in our given data. The rate of change of the volume is given as a 
constant 2 cm³/sec, so $\frac{dV}{dt} = 2$. Also, at the time t we are interested in, we are given r(t) = 3. So we plug these in: 

$$2 = 4\pi (3)^2 \frac{dr}{dt}.$$

Thus, we solve to get $\frac{dr}{dt} = \frac{2}{36\pi} = \frac{1}{18\pi}$. So the radius is increasing by $\frac{1}{18\pi}$ cm/sec. □ 
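
One way to sanity-check a related rates answer is to simulate the situation directly. In this Python sketch we assume, as in the solution, that the volume grows at a constant 2 cm³/sec and that the radius is 3 cm at the moment of interest; we then estimate dr/dt from the radius just before and just after that moment. The helper name `radius` and the step size `dt` are our own choices.

```python
import math

def radius(volume):
    """Radius of a sphere with the given volume, from V = (4/3) pi r^3."""
    return (3 * volume / (4 * math.pi)) ** (1 / 3)

volume_now = 4 / 3 * math.pi * 3**3       # volume at the moment when r = 3
volume_rate = 2.0                         # dV/dt in cm^3 per second
dt = 1e-6                                 # a tiny time step, in seconds

# Average rate of change of r over the short interval [t - dt, t + dt].
numeric_rate = (radius(volume_now + volume_rate * dt)
                - radius(volume_now - volume_rate * dt)) / (2 * dt)
formula_rate = volume_rate / (4 * math.pi * 3**2)   # dr/dt = (dV/dt) / (4 pi r^2)

print(f"numeric dr/dt: {numeric_rate:.6f} cm/sec")
print(f"formula dr/dt: {formula_rate:.6f} cm/sec")
```

Both numbers should agree with 1/(18π), which is roughly 0.0177 cm/sec.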


This is the basic premise behind all related rates problems. We have two quantities, both of which are functions 
of the same independent variable (usually time). If there is an equation relating the two quantities, then we can 
implicitly differentiate that equation with respect to the independent variable to get an equation relating 
the variables and their derivatives. 


In related rates problems, it is customary not to write explicitly the (t) part of the notation. For example, 
in Problem 4.30, we would typically write the equation relating the quantities as simply $V = \frac{4}{3}\pi r^3$, omitting the 
mention of the variable t. Then, the derivative of this equation is $V' = 4\pi r^2 r'$. However, it's important to remember 
that in this expression, V and r (and r') are quantities that depend on t. 


Problem 4.31: An observer is 1 kilometer away from a rocket launch pad. At time t = 0, the rocket lifts off 
straight upwards, and at time t seconds has achieved an altitude of $10t^2$ meters. At 5 seconds after takeoff, 
how fast is the distance between the rocket and the observer increasing? 


Solution for Problem 4.31: Even though there is only one moving object (the rocket), there are two quantities that 
we are interested in that are changing with respect to time: the altitude of the rocket (which we'll call h) and the 
distance from the rocket to the observer (which we'll call x). 


The picture at right shows how these quantities are related. But don't make the 
following mistake: 


[Figure: right triangle with vertices at the Observer, the launch pad, and the Rocket; the rocket's altitude is $10t^2$ m.] 


Bogus Solution: $x^2 = h^2 + 1^2$. 


We have to be more careful about the units! The observer is 1 kilometer from the launch 
pad, but the height of the rocket is given in meters. So, if we want to express everything 
in meters, we have the equation 

$$x^2 = h^2 + (1000)^2.$$


Taking the derivative of both sides with respect to t, we get 

$$2x\frac{dx}{dt} = 2h\frac{dh}{dt}.$$

We are given t = 5. But t doesn't directly appear as a term in the above expression. So we need to figure out what 
the quantities in the above equation are in terms of t. 


When t = 5, we have $h = 10t^2 = 250$. This means that $x^2 = (250)^2 + (1000)^2 = 17(250)^2$, so $x = 250\sqrt{17}$. Also, 
$\frac{dh}{dt} = 20t = 100$. Thus, we have 

$$2x\frac{dx}{dt} = 2h\frac{dh}{dt} \quad\Rightarrow\quad (250\sqrt{17})\frac{dx}{dt} = (250)(100) \quad\Rightarrow\quad \frac{dx}{dt} = \frac{100}{\sqrt{17}} = \frac{100\sqrt{17}}{17}.$$

So the distance between the rocket and the observer is increasing at a rate of $\frac{100\sqrt{17}}{17}$ m/sec. □ 
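
As a check on both the units and the final answer, we can write the distance as an explicit function of time and estimate its rate of change at t = 5 directly. This is a Python sketch; the function name `distance` and the step size `dt` are ours.

```python
import math

def distance(t):
    """Distance (meters) from the observer to the rocket at time t seconds."""
    altitude = 10 * t**2                  # meters
    return math.hypot(altitude, 1000.0)   # the launch pad is 1 km = 1000 m away

t, dt = 5.0, 1e-6
numeric_rate = (distance(t + dt) - distance(t - dt)) / (2 * dt)
exact_rate = 100 / math.sqrt(17)

print(f"numeric dx/dt at t = 5: {numeric_rate:.4f} m/sec")
print(f"100 / sqrt(17):         {exact_rate:.4f} m/sec")
```

If we had used 1 instead of 1000 for the horizontal distance, as in the bogus solution, the two numbers would not come close to matching.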


The next problem is a classic related rates problem: 


Problem 4.32: A 10-foot-tall ladder rests against a wall, but the foot of the ladder is slipping away from the 
wall at a rate of 1 in/sec. When the ladder forms a 60° angle to the ground, how fast is the top of the 
ladder sliding down? 


Solution for Problem 4.32: We should set this up to have as our two “variables” the functions whose rates we care 
about. Therefore, we should use w (the distance from the wall to the foot of the ladder) and h (the height of the top of the ladder). We can 
then write an equation relating these quantities: 

$$w^2 + h^2 = 100.$$

But this uses feet as the units on the right side. Since the rate is in in/sec, maybe we should write the whole thing 
in inches: 
$$w^2 + h^2 = 14400.$$

Although it is always good practice to keep our units consistent, it doesn't really matter here, because the next 
step is to take the derivative with respect to time: 

$$2ww' + 2hh' = 0.$$

Now we need to plug in the values of these quantities so that we can solve for h'. We know that w' = 1 (in/sec). 
We need to figure out w and h. But the other bit of information that we haven't used yet is the 60° angle. This 
gives us w = 5 (feet) and $h = 5\sqrt{3}$ (feet). But the rate of change is in inches per second! So we need to use 
w = 60 (inches) and $h = 60\sqrt{3}$ (inches). We then plug these values in to our related rates expression: 

$$2(60)(1) + 2(60\sqrt{3})h' = 0.$$
Solving for h' gives $h' = -\frac{1}{\sqrt{3}} = -\frac{\sqrt{3}}{3}$. So the top of the ladder is falling at a rate of $\frac{\sqrt{3}}{3}$ inches per second. □ 


By the way, in Problem 4.32, what does the function h'(t) look like in general? We can solve for h' in our related 
rates equation: 
$$h' = -\frac{w}{h}w' = -\frac{w}{h},$$
since w' = 1. But we know what the quantity w/h is: it's the cotangent of the angle that the ladder makes with the 
ground. So we have 

$$h' = -\cot\theta,$$
where $\theta$ is the angle that the ladder makes with the ground. 


This makes “real world” sense. At $\theta = \pi/2$, the ladder is vertical, and $\cot(\pi/2) = 0$, meaning that the top of the 
ladder is stationary. As $\theta$ decreases, $\cot\theta$ increases, so the top of the ladder moves faster and faster downwards. 
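
To see this behavior numerically, the following Python sketch computes h' from the related rates equation for several angles. It assumes the same 120-inch ladder and w' = 1 in/sec as in Problem 4.32; the names `LADDER`, `W_RATE`, and `top_speed` are ours.

```python
import math

LADDER = 120.0   # ladder length in inches
W_RATE = 1.0     # w', the speed of the foot of the ladder, in inches per second

def top_speed(theta_deg):
    """Return h' (in/sec) when the ladder makes the given angle with the ground."""
    theta = math.radians(theta_deg)
    w = LADDER * math.cos(theta)          # distance from the wall to the foot
    h = LADDER * math.sin(theta)          # height of the top of the ladder
    return -w * W_RATE / h                # solve 2ww' + 2hh' = 0 for h'

for angle in (80, 60, 45, 30, 10):
    print(f"theta = {angle:2d} degrees: h' = {top_speed(angle):8.4f} in/sec")
```

The speeds grow rapidly as the angle gets small, which matches the fact that $\cot\theta$ is unbounded as $\theta$ approaches 0.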


Let’s look at one more related rates problem that’s a bit harder. 


Problem 4.33: A cone-shaped filter has a hole at the top of radius 4 cm and a hole at the 
bottom of radius 1 cm, and is 6 cm in height. Water flows out of the bottom at a rate of 
2 cm³/sec. If the filter begins completely filled at time t = 0, how fast is the water level 
decreasing after 30 seconds? 


Solution for Problem 4.33: We start by identifying the relevant quantities that are changing with respect to time. 
These are the height of the water, which we'll call h, and the volume of water in the filter, which we'll call V. We 
need to determine how they are related. 


We see that the filter is a frustum, which is a cone with a smaller cone chopped off. The 
volume of a frustum is most easily computed as the difference in volume between the two 
cones. To compute the volumes of the cones, we need their heights. So we will be much 
better off if our height is measured from the vertex of the imaginary cones, not from the 
bottom of the filter. 


Next, the formula for the volume of a cone is $\frac{1}{3}\pi r^2 h$, where r is the radius and h is the 
height. So we need to determine how the height is related to the radius. They are directly 
proportional, as we can see from our picture at right. In particular, decreasing the height 
by 6 causes a decrease of 3 in the radius, so we conclude that h/r = 2, and thus the radius 
at height h is h/2. Note that the top of our frustum is at h = 8 (not h = 6), and at the top we have radius 8/2 = 4, as 
expected. The bottom of the frustum is at h = 2 · 1 = 2. 


So the volume of the water at height h is: 

$$V = \frac{1}{3}\pi r^2 h - \frac{1}{3}\pi (1)^2 (2) = \frac{1}{3}\pi \left(\frac{h}{2}\right)^2 h - \frac{1}{3}\pi (1)^2 (2) = \frac{1}{12}\pi h^3 - \frac{2}{3}\pi.$$

We can differentiate this equation to get an equation relating the rates: 
$$V' = \frac{1}{4}\pi h^2 h'.$$


We are told that V' = −2, as the rate of decrease of volume is constant. But we also need to find the value of h in 
order to solve for h'. So we need to find the volume at t = 30 and compute its height. 


We can start by finding the initial volume (at t = 0): 

$$V(0) = \frac{1}{12}\pi (8)^3 - \frac{2}{3}\pi.$$

This simplifies to $42\pi$. 


When t = 30, we will have lost 2 · 30 = 60 cm³ of water, so the volume will be $V(30) = 42\pi - 60$. We can plug 
this in to our equation for volume to find h at time t = 30: 

$$42\pi - 60 = \frac{1}{12}\pi h^3 - \frac{2}{3}\pi.$$

This gives $h^3 = \frac{512\pi - 720}{\pi}$, and thus $h = \sqrt[3]{\frac{512\pi - 720}{\pi}} \approx 6.564$. 


Finally, we can go back to our related rates equation: 
$$-2 = \frac{1}{4}\pi h^2 h',$$
and plug in the found value of h. This gives: 

$$-2 = \frac{1}{4}\pi \left(\frac{512\pi - 720}{\pi}\right)^{2/3} h'.$$
We solve for h' to get our answer: 
$$h' = -\frac{8}{\sqrt[3]{\pi}\,(512\pi - 720)^{2/3}} = -\frac{2}{\sqrt[3]{\pi}\,(64\pi - 90)^{2/3}}.$$


This is approximately −0.0591, and thus at time t = 30 the water level is decreasing at a rate of 0.0591 cm/sec. □ 
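
The arithmetic in this problem has several steps, so it is worth double-checking by machine. Here is a Python sketch that follows the same steps; the helper names `water_volume` and `height_from_volume` are our own, and h is measured from the vertex of the imaginary cone, as in the solution.

```python
import math

def water_volume(h):
    """Volume (cm^3) of water when the surface is at height h above the cone's vertex."""
    return math.pi * h**3 / 12 - 2 * math.pi / 3

def height_from_volume(v):
    """Solve v = (pi / 12) h^3 - (2 / 3) pi for h."""
    return (12 * (v + 2 * math.pi / 3) / math.pi) ** (1 / 3)

v30 = water_volume(8.0) - 2 * 30          # start full (h = 8), lose 2 cm^3 per second
h30 = height_from_volume(v30)
h_rate = -2 / (math.pi * h30**2 / 4)      # solve V' = (pi / 4) h^2 h' with V' = -2

print(f"initial volume: {water_volume(8.0):.4f} cm^3 (42 pi = {42 * math.pi:.4f})")
print(f"h at t = 30:    {h30:.3f} cm")
print(f"h' at t = 30:   {h_rate:.4f} cm/sec")
```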

EXERCISES 


4.6.1 A plane is flying overhead at an altitude of 10 km and a speed of 200 m/sec. The plane is 20 km away from 
a camera on the ground, and the plane is flying in a direction that will take it directly over the camera. The camera 
continuously rotates to keep the plane centered in its lens. When the plane is 15 km away from the camera, how 
fast is the camera rotating? 


4.6.2 Boyle’s Law states (assuming that the temperature is constant) that the pressure and volume of a gas are 
inversely proportional. Suppose Gas X starts at 1 atm (atmosphere) of pressure and takes up 200 cm³. If the 
pressure is increased by 0.1 atm/min, then after 10 minutes, at what rate is the volume decreasing? Hints: 64 


4.6.3 A spherical snowball is 10 cm in diameter, but is melting at the rate of 0.5 cm³ per second. When the 
snowball is reduced to half its original volume, at what rate is its diameter decreasing? 


4.6.4 A meteorite has entered the earth’s atmosphere and is burning up at a rate that is proportional to the 
meteorite’s surface area. What can you determine about the rate that the meteorite’s radius is decreasing? 


4.6.5 Two cars start at an intersection at time t = 0 with velocity 0. At t = 0, one car accelerates due north at 8 
m/sec², and the other car accelerates due east at 5 m/sec². After 6 seconds, at what rate is the distance between 
the cars increasing? 


4.6.6★ The clock on the wall at my office has an hour hand that's 6 cm long and a minute hand that's 10 cm long. 
At exactly 2:00, at what rate is the distance between the tips of the two hands changing? Hints: 148, 67 


REVIEW PROBLEMS 


4.34 The following is the graph of the derivative $f'$ for some function $f$. Sketch possible graphs of $f$ and $f''$. 
Take into account where the function is increasing, decreasing, its concavity, and any inflection points. 


4.35 Suppose that f is a function with domain $\mathbb{R}$ such that $f'(x) = x^2 - 3x + 3$ for all $x \in \mathbb{R}$. Prove that f has an 
inverse. 


4.36 
(a) Show that if f is a quadratic polynomial, then the graph of f has no inflection points. 
(b) Show that if f is a cubic polynomial, then the graph of f has exactly one inflection point. 


(c) Can you generalize? 


4.37 Describe the graph of $f(x) = x^3 + px + q$ where p and q are constants. Determine where the function is 
increasing or decreasing, where any local maxima or minima are located, and the concavity and inflection points 
of the graph. 


4.38 Let $m, n > 2$ be integers. Describe (in as much detail as possible) the graph of $f(x) = x^m (1 - x)^n$ on the interval 
[0,1]. Hints: 122 


4.39 A factory is making blivets. The factory costs $2500 per day to run, and each blivet costs $900 to make. Tina, 
the factory owner, knows that if she prices the blivets at $(2500 — x), then she will sell x of them per day. What 
should she do to maximize her profit? 


4.40 We wish to construct a fenced garden with the shape of a circular sector, with 
area 300 m². What radius r and angle $\theta$ should we choose to minimize the amount 
of fencing required (shown in dark black in the picture at right)? (Note that we could 
have $\theta < \pi$, contrary to the diagram; the only restriction is $0 < \theta < 2\pi$.) 


4.41 Acceleration due to gravity is actually slightly lower at higher altitudes. In 
particular, $g = \frac{GM}{r^2}$ where G is a constant (called the gravitational constant), M is the 
mass of the Earth, and r is the distance from the center of the Earth. If the radius of 
the Earth is 6400 km (approximately), estimate the percentage decrease in gravity on 
an airplane cruising at an altitude of 10 km above the surface of the Earth. 


4.42 The Ferris wheel at the County Fair is 60 meters in diameter and rotates at a rate of 4 revolutions per minute. 
If Natalie is sitting in one of the cars, at what rate is she gaining altitude when she is passing (on the way up) 
through a height of 40 meters? 


4.43 (Use a calculator for this one.) The temperature of a pie after it is removed from the oven is $70 + 230e^{-kt}$ 
degrees Fahrenheit, where t is the time in minutes and k is some constant. After 10 minutes, the pie has cooled to 
200 degrees. 


(a) When will the pie be 150 degrees? 
(b) At the time in part (a), how fast is the pie cooling? 
4.44 


(a) Suppose that f and g are directly proportional (so that f(t) = kg(t) for some constant k, for all t). What can 
you say about the relationship between the rates of change of f and g? 


(b) Suppose that f and g are inversely proportional (so that f(t)g(t) = k for some constant k, for all t). What can 
you say about the relationship between the rates of change of f and g? 


CHALLENGE PROBLEMS 


4.45 Explain and prove the Racetrack Theorem: if f and g are differentiable functions, with f(a) = g(a) for some 
a and $f'(x) > g'(x) > 0$ for all x > a, then f(x) > g(x) for all x > a. 


4.46 What is the relationship between the critical points of $f$ and the critical points of $f^2$? 


4.47 Determine the real number a having the property that f(a) = a is a relative minimum of 
$$f(x) = x^4 - x^3 - x^2 + ax + 1.$$


(Source: HMMT) Hints: 247 


4.48 


(a) Sam wishes to cross a circular lake with diameter 1 km. He can row across the water at a rate of 4 km/hour, 
or he can walk along the shore (carrying the rowboat) at a rate of 6 km/hour. What is the minimum amount 
of time necessary to cross the lake? Hints: 71, 119, 218 


(b) Now Sam wishes to cross a river of width 1 km, and reach a point on the opposite bank that is 1 km 
downstream. Again, he can row across the water at a rate of 4 km/hour (ignore the effect of the river's 
current), or he can walk along the shore (carrying the rowboat) at a rate of 6 km/hour. What is the minimum 
amount of time necessary to reach his destination? Hints: 76, 54, 224 


4.49 The Rule of 72 states that if an amount of money is invested at r% interest per year, then it will take 
approximately 72/r years for the money to double. Use linear approximation to explain why this rule is a good 


approximation for small values of r. Hints: 219 


4.50 Snell's Law describes how light passes through two mediums in 
which the speed of light is different. Suppose that light travels from the 
point A in the positive half-plane (where y > 0) to the point B in the 
negative half-plane (where y < 0). Further suppose that light has speed 
$c_1$ in the positive half-plane and speed $c_2$ in the negative half-plane. Find 
an equation that relates the angles at which the light ray hits the x-axis with 
the speeds of light in the two half-planes, given that light will always 
travel such that its total travel time is minimized. Hints: 141, 47 


[Figure: the points A and B on opposite sides of the x-axis, with the speed of light labeled in each half-plane.] 


4.51★ Suppose that a function f is twice-differentiable. Recall from 
Problem 4.22 that we defined E(x) to be the error in the approximation 
of f(x) using the local linearization at a; that is, 

$$E(x) = f(x) - \bigl(f(a) + f'(a)(x - a)\bigr).$$

Show that for any z > a, we have $|E(z)| \le |M|(z - a)^2$, where M is the maximum value of $f''(x)$ on the interval [a, z]. 
(In fact, later in the book, we will show that $|E(z)| \le \frac{1}{2}|M|(z - a)^2$, but this is difficult with our current technology.) 
Hints: 279, 260 


4.52★ A snowplow can remove snow at a constant rate (in ft³/min). One day, there was no snow on the ground 
at sunrise, but sometime in the morning it began snowing at a steady rate. At noon, the plow began to remove 
snow. It had cleared 2 miles of snow between noon and 1 PM, and 1 more mile of snow between 1 PM and 2 PM. 


At what time did it start snowing? Hints: 12, 189 


INTEGRATION 


Back in Chapter 3, we mentioned two fundamental geometric questions that motivate calculus: 


Question 1: Given a real number a in the domain of a function f, what 
is the slope of the line that is tangent to the graph y = f(x) at the point 
(a, f(a))? 


Question 2: Given real numbers a < b such that the interval [a, b] is a subset 
of the domain of a function f, what is the area of the region bounded by 
the graph y = f(x), the lines x = a and x = b, and the x-axis? 


We developed the derivative to answer Question 1. In this chapter, we will develop the definite integral to 
answer Question 2. It turns out that the definite integral is intimately related to derivatives; this is so profound 
and so central a fact that it is called the Fundamental Theorem of Calculus. 


5.1 AREA UNDER A CURVE 


You already have a pretty good idea what we mean when we discuss the area of a geometric figure. The 
concept of “area under a curve” is not really that different. 


For simplicity, we'll first look just at positive-valued continuous functions: 


Definition: Let [a,b] be a closed interval, and suppose that f is continuous on [a,b] such that f(x) ≥ 0 for all 
x ∈ [a,b]. The area under the curve y = f(x) (or, more briefly, the area under f) on the interval [a, b] is the area 
of the geometric figure bounded by: 


• the graph y = f(x) between (a, f(a)) and (b, f(b)), 
• the line segment x = a between (a,0) and (a, f(a)), 
• the line segment x = b between (b,0) and (b, f(b)), 
• and the x-axis between (a,0) and (b,0). 


This is a rather cumbersome definition, but the picture to the right above makes clear what’s going on. We’re 
just taking the graph of y = f(x) on the domain a < x < b, and extending vertical lines from the two endpoints 
down to the x-axis. We then take the area of the shaded region as shown in the picture. 


If f is sufficiently simple, then finding the area under f is not much of a problem. 


Problem 5.1: Let f(x) = 3. Find the area under f on the interval [—2, 5]. 


Solution for Problem 5.1: The graph of y = 3 is just a straight horizontal line. So the region in question is just a 
rectangle with height 3 and length 5 − (−2) = 7. Thus the area is 3 · 7 = 21. □ 


Problem 5.2: Let f(x) = 3x + 2. Find the area under f on the interval (0, 4]. 


Solution for Problem 5.2: We draw y = 3x + 2 between x = 0 and x = 4, and the resulting 
region under the curve is a trapezoid with vertices (0,0), (0,2), (4,14), and (4,0). The bases 
have lengths f(0) = 2 and f(4) = 14, and the distance between the bases is 4. Thus the area 
is 4(2 + 14)/2 = 32. □


Problem 5.3: Let f(x) = |x|. Find the area under f on the interval $[-3, 1]$.


Solution for Problem 5.3: When we draw the graph of y = |x|, we see that the region
under the graph between $-3$ and 1 consists of two triangles. The triangle to the left
of the y-axis has base 3 and height 3, so it has area $\frac{9}{2}$. The triangle to the right of the
y-axis has base 1 and height 1, so it has area $\frac{1}{2}$. Thus the total area is $\frac{9}{2} + \frac{1}{2} = 5$. □


Problem 5.4: Let $f(x) = \sqrt{4 - x^2}$.
(a) Find the area under f on the interval $[-2, 2]$.
(b) Find the area under f on the interval [0, 1].


Solution for Problem 5.4: At first, this looks like a difficult function to deal with. But if we write it as $y = \sqrt{4 - x^2}$
and square both sides, we get $y^2 = 4 - x^2$, so $x^2 + y^2 = 4$. In other words, the graph is the top half of a circle
centered at the origin with radius 2.


(a) The area under f on the interval $[-2,2]$ is just half of the area of a circle of radius 2. Thus the area is
$(1/2)(4\pi) = 2\pi$.

(b) The picture of the region whose area we wish to compute is shown at right. If we
draw the segment from (0,0) to $(1, \sqrt{3})$, we see that this region is a right triangle
with legs 1 and $\sqrt{3}$ together with $\frac{1}{12}$ of a circle of radius 2 (the sector between the y-axis and the
segment spans an angle of 30°). Thus the area is

$$\frac{\sqrt{3}}{2} + \frac{\pi}{3}.$$ □


We don’t want to restrict ourselves just to positive-valued functions. We want to define area “under” the curve 
of functions that take on negative values as well. This is really area “over” such a curve, but we want this area to 
count “negatively,” so we make the following definition: 


Definition: Let [a,b] be a closed interval, and suppose that f is continuous on [a,b] such that $f(x) \le 0$ for all
$x \in [a,b]$. The area under the curve y = f(x) (or, more briefly, the area under f) on the interval [a,b] is minus
the area of the geometric figure bounded by:


• the graph y = f(x) between (a, f(a)) and (b, f(b)),
• the line segment x = a between (a,0) and (a, f(a)),
• the line segment x = b between (b,0) and (b, f(b)),
• and the x-axis between (a,0) and (b,0).


This is exactly the same definition except for the word “minus.” So, for example, the area under the function 
$f(x) = -2$ on the interval [1,4] is $-(2 \cdot 3) = -6$. Even though this is really area “over” the curve, the convention is
to call it area “under” the curve and to count any area below the x-axis as “negative” area.


This hopefully makes it clear what to do with an arbitrary continuous function that takes on some positive
values and some negative values. We consider the part of the region that’s above the x-axis as positive area, and 
we consider the part of the region that’s below the x-axis as negative area. We’ll define “area under a curve” more 
formally in a moment, but first let’s just try to get a general understanding with a quick problem. 


Problem 5.5: Let $f(x) = 2x - 5$. Compute the area under f on the interval [0,6].


Solution for Problem 5.5: We draw $y = 2x - 5$ for $0 \le x \le 6$, and we see that it is negative for
$x < \frac{5}{2}$ and positive for $x > \frac{5}{2}$. The portion on the interval $[0, \frac{5}{2}]$ contributes a negative area of
$(\frac{1}{2})(\frac{5}{2})(-5) = -\frac{25}{4}$, and the portion on the interval $[\frac{5}{2}, 6]$ contributes a positive area of $(\frac{1}{2})(\frac{7}{2})(7) = \frac{49}{4}$.
So the total area under the curve is $\frac{49}{4} - \frac{25}{4} = 6$. □
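
As a numerical sanity check of the sign convention, here is a minimal Python sketch. It is not part of the original text, and the helper name `signed_area_midpoint` is an assumption made for illustration. It adds up thin rectangles whose heights are sampled at the midpoint of each strip, so strips lying below the x-axis contribute negative area.

```python
def signed_area_midpoint(f, a, b, n=1000):
    """Approximate the signed area under f on [a, b] using n midpoint rectangles."""
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        midpoint = a + (i + 0.5) * width
        total += f(midpoint) * width  # negative heights give negative area
    return total


def f(x):
    return 2 * x - 5


negative_part = signed_area_midpoint(f, 0, 2.5)  # region below the x-axis
positive_part = signed_area_midpoint(f, 2.5, 6)  # region above the x-axis
total = signed_area_midpoint(f, 0, 6)

if __name__ == "__main__":
    print(negative_part, positive_part, total)
```

For a linear function the midpoint rectangles reproduce the triangle areas up to rounding error, so the three values should agree with $-\frac{25}{4}$, $\frac{49}{4}$, and 6.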


When computing area under a curve y = f(x), the portion of the curve where
f(x) > 0 contributes positive area, and the portion of the curve where f(x) < 0
contributes negative area.


All of the problems that we’ve solved so far were computing areas of regions that were triangles or trapezoids 
or circles or other simple geometric objects. But what about a region under an arbitrary curve? Also, what we’ve 
done so far in “defining” area under a curve is not very rigorous. For one thing, we never really defined what 
“area” means—rather, we relied on some geometric intuition about area (and most likely when you first learned 
about area in geometry you didn’t really do it rigorously then either). 


We need to begin somewhere, and one shape that we can hopefully all agree on is a rectangle: if a rectangle 
has length $l$ and width $w$, then it has area $lw$. We'll use rectangles to help us define areas for all other regions by
approximating, in a systematic way, the area of a region under an arbitrary graph as a sum of areas of rectangles. 
We'll start with an example: 


Problem 5.6: We wish to compute the area under the curve $f(x) = x^2$ on the interval [1,2].


(a) Draw a rectangle with opposite corners (1, f(1)) and (2,0). What is its area? How does its area compare 
to the area under f on [1,2]? 


(b) Draw 2 rectangles: one with opposite corners (1, f(1)) and (1.5,0), and one with opposite corners 
(1.5, f(1.5)) and (2,0). What is the total area of the two rectangles? How is this total area related to 
the area under f on [1,2]? How does this area compare to the area from part (a)? 


(c) For any positive integer n, draw n rectangles: the $i$th rectangle (for $0 \le i < n$) has opposite corners
$(1 + \frac{i}{n}, f(1 + \frac{i}{n}))$ and $(1 + \frac{i+1}{n}, 0)$. What is the total area of the n rectangles (in terms of n)? How is this total
area related to the area under the curve?


(d) In (c), what happens as n increases? What happens as n grows large?


(e) Using (c) and (d), determine the area under f on [1,2].


Solution for Problem 5.6: The region in question is shown in the diagram at right. Unfortunately,
we don’t know geometrically how to compute the area of a region bordered by a parabola. 


Sidenote: Actually, the ancient Greek mathematician Archimedes did know how to compute
the area under a parabola! He used triangles instead of rectangles, but his method
was not all that different from what we are about to do below.


However, we do know how to compute the area of simple shapes, and the simplest shape to compute the area
of is a rectangle. We will use rectangles to approximate the area under the parabola. (Note: we will distort the
horizontal scale of the pictures that follow, so that what we are doing is a bit more clear.)


(a) As a very crude estimate, we can draw a rectangle with opposite corners at (1,1)
and (2,0). This rectangle has width 1 and height 1, so the rectangle has area 1. 
This is clearly a way-too-small subset of the area under f—in particular, the light- 
shaded area under the graph of f but above the rectangle is not being counted—but 
it does give a lower bound of 1 for the area under f. 


(b) To get a somewhat better estimate, we can draw two rectangles, each of width 0.5 
and each with the upper-left corner touching the graph, as shown to the right. The 
first rectangle has opposite corners at (1, 1) and (1.5, 0), and has area 0.5. The second 
has opposite corners at (1.5,2.25) and (2,0), and has area 1.125. The total area of 
the rectangles—the dark region in the diagram at right—is thus 0.5+ 1.125 = 1.625. 
That’s a better estimate, but we can still clearly see from the picture that it’s too 
small: the light-grey shaded area is still missing from our total. 


(c) We can do the same thing with an arbitrary number of rectangles. For example, to
the right we show 7 rectangles. We can see from this picture that the sum of the 
areas of the rectangles—the entire region in dark grey to the right—is now a fairly 
close approximation to the entire area under the curve; only the light grey regions 
are missing. 


To be more precise, let n be any positive integer, and we'll draw n rectangles, each
with width $\frac{1}{n}$. That is, each rectangle will have vertical sides at $x = 1 + \frac{i}{n}$ and $x = 1 + \frac{i+1}{n}$
for some integer $0 \le i < n$. The height of the $i$th rectangle is $f(1 + \frac{i}{n}) = (1 + \frac{i}{n})^2$. So
the total area is

$$\sum_{i=0}^{n-1} \frac{1}{n}\left(1 + \frac{i}{n}\right)^2.$$

We can simplify this sum:

$$\sum_{i=0}^{n-1} \frac{1}{n}\left(1 + \frac{i}{n}\right)^2 = \frac{1}{n^3}\sum_{i=0}^{n-1} (n^2 + 2ni + i^2).$$

Now, we can use the formulas $\sum_{i=0}^{n-1} i = \frac{n(n-1)}{2}$ and $\sum_{i=0}^{n-1} i^2 = \frac{n(n-1)(2n-1)}{6}$ to get a closed-form expression
in terms of n:

$$\frac{1}{n^3}\sum_{i=0}^{n-1} (n^2 + 2ni + i^2) = \frac{1}{n^3}\left(n^3 + n^2(n-1) + \frac{n(n-1)(2n-1)}{6}\right) = \frac{14n^2 - 9n + 1}{6n^2}.$$


(d) The expression that we got in part (c) for the area of n rectangles,

$$\frac{14n^2 - 9n + 1}{6n^2} = \frac{7}{3} - \frac{3}{2n} + \frac{1}{6n^2},$$

is a lower bound for the area under f on [1,2] for all n. As n gets larger, this bound appears to get more and
more accurate, since the rectangles more closely approximate the entire region under f.


(e) As n gets arbitrarily large, the $-\frac{3}{2n}$ and $\frac{1}{6n^2}$ terms in our rectangles-area expression approach 0. Thus, we
guess that the area under $f(x) = x^2$ on [1,2] is $\frac{7}{3}$.


In fact, this “guess” is correct, as we will soon see. 
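
The rectangle sums from parts (a) through (e) are easy to reproduce by machine. The following is a minimal Python sketch, not part of the original text; the function names `left_rectangle_area` and `closed_form` are assumptions chosen for illustration. It uses exact fractions so that the sum of rectangle areas can be compared directly with the closed-form expression $\frac{7}{3} - \frac{3}{2n} + \frac{1}{6n^2}$.

```python
from fractions import Fraction


def left_rectangle_area(n):
    """Total area of n left-endpoint rectangles under f(x) = x^2 on [1, 2]."""
    width = Fraction(1, n)
    total = Fraction(0)
    for i in range(n):
        left = 1 + Fraction(i, n)   # left edge of the i-th rectangle
        total += width * left ** 2  # height is f(left) = left^2
    return total


def closed_form(n):
    """The closed-form expression 7/3 - 3/(2n) + 1/(6n^2) from part (d)."""
    return Fraction(7, 3) - Fraction(3, 2 * n) + Fraction(1, 6 * n * n)


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 1000):
        area = left_rectangle_area(n)
        print(n, area, float(area))
```

With n = 1 and n = 2 this reproduces the estimates 1 and 1.625 from parts (a) and (b), and for larger n the values creep upward toward $\frac{7}{3}$ without ever reaching it.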


We'll define area under a curve in much the same way as we did in our example in Problem 5.6, by subdividing 
the area into rectangles. However, in Problem 5.6, we took care to make sure that each rectangle had the same
width: if we used n rectangles, we constructed them so that each rectangle would have width $\frac{1}{n}$. In general, we
don’t need to be quite so picky. So let’s first define a word for dividing up an interval [a,b] into a bunch of smaller 
subintervals that we will use as the widths of our rectangles. 


Definition: A partition P of a closed interval [a,b] (into n parts) is a sequence

$$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b.$$

In other words, we’ve divided up the interval [a,b] into n smaller intervals:

$$[a,b] = [x_0, x_1] \cup [x_1, x_2] \cup [x_2, x_3] \cup \cdots \cup [x_{n-1}, x_n].$$


Each of the intervals will serve as the base of one of our rectangles. But what should be the height? 


We could let the left side of each rectangle touch the curve, as we did for
our parabola in Problem 5.6. But for a more general function, this may produce 
rectangles both above and below the curve, as in the figure to the right. This is 
slightly unpleasant, because then how do we know whether the sum of the areas 
of the rectangles is too big or too small an approximation for the area under the 
curve? 


Instead, there are two more manageable choices for the height of each rectangle.
We can take a rectangle whose height is the smallest possible, as in the
picture to the left below. Or we can take a rectangle whose height is the largest possible, as in the picture to the
right below. Clearly, the sum of the areas of the rectangles in the left picture is too small compared to the area 
under the curve, whereas the sum of the areas of the rectangles in the right picture is too large. 


As the widths of the rectangles get narrower and narrower, the two areas should converge to equal the area 
under the curve. (Hopefully.)


Let’s try to make this rigorous. For our partition P, let $h_i$ be the minimum value of f(x) on the $i$th piece of the
partition; that is, for all $0 \le i < n$,

$$h_i = \inf\{ f(x) \mid x \in [x_i, x_{i+1}] \}.$$

Note that since f is continuous, it must have a minimum value on any closed interval, so these values $h_i$ will exist.
These will be the heights of our smaller rectangles.


Note that these heights don’t have to be at one end or the other of the rectangle;
they might be in the middle. For example, if we took a different partition of
this function, we would get the rectangles on the right. In particular, note that
the middle rectangle has a height “in the middle,” not at either endpoint.


Similarly, we let $H_i$ be the maximum value of f(x) on the $i$th piece of the
partition; that is,

$$H_i = \sup\{ f(x) \mid x \in [x_i, x_{i+1}] \}$$

for all $0 \le i < n$. These will be the heights of our larger rectangles.


Using our smaller heights, we get the lower area for a partition P for the region under the curve f:

$$l(f,P) = \sum_{i=0}^{n-1} h_i (x_{i+1} - x_i).$$

Our reasoning above tells us that this area should be less than (or perhaps equal to) the area under the curve. Of
course, $l$ depends both on our function f and our choice of partition P of [a,b]. This sum $l(f,P)$ is sometimes more
technically referred to as a lower Darboux sum.


Similarly, using our larger heights for P, we get an upper area for the region under the curve f:

$$u(f,P) = \sum_{i=0}^{n-1} H_i (x_{i+1} - x_i).$$

This area should be greater than (or equal to) the area under the curve. $u(f,P)$ is sometimes called an upper
Darboux sum.


Thus, for any choice of partition P we get a lower estimate $l(f,P)$ of the area and an upper estimate $u(f,P)$ of
the area; that is, we expect that

$$l(f,P) \le \text{Area} \le u(f,P)$$


for all partitions P. By picking very fine partitions—that is, partitions that give us very narrow rectangles—both 
of these estimates should be very close to the actual area. In fact, we'll use a limit-like argument to define area this 
way. 
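
To see lower and upper Darboux sums squeeze the area on partitions with unequal pieces, here is a minimal Python sketch. It is not from the original text. The helper `darboux_sums_increasing` is a hypothetical name, and it relies on the simplifying assumption that f is increasing, so that the minimum and maximum on each piece occur at its left and right endpoints; a general continuous function would need a true minimum and maximum on each piece.

```python
def darboux_sums_increasing(f, partition):
    """Return (lower, upper) Darboux sums for an increasing function f.

    Assumes f is increasing on the whole interval, so on each piece
    [x_i, x_{i+1}] the minimum is f(x_i) and the maximum is f(x_{i+1}).
    """
    lower = 0.0
    upper = 0.0
    for x_left, x_right in zip(partition, partition[1:]):
        width = x_right - x_left
        lower += f(x_left) * width   # h_i = f(x_i)
        upper += f(x_right) * width  # H_i = f(x_{i+1})
    return lower, upper


def square(x):
    return x * x


# A deliberately uneven partition of [1, 2]: pieces need not have equal width.
coarse = [1.0, 1.2, 1.5, 2.0]
# A finer partition obtained by adding points to the coarse one.
fine = [1.0, 1.1, 1.2, 1.35, 1.5, 1.75, 2.0]

coarse_lower, coarse_upper = darboux_sums_increasing(square, coarse)
fine_lower, fine_upper = darboux_sums_increasing(square, fine)

if __name__ == "__main__":
    print(coarse_lower, coarse_upper)
    print(fine_lower, fine_upper)
```

Adding points to the partition raises the lower sum and lowers the upper sum, and the value $\frac{7}{3}$ guessed in Problem 5.6 stays trapped between them.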


Definition: Let f be a continuous function defined on a closed interval [a,b]. Let

$$L_a^b(f) = \sup\{ l(f,P) \mid P \text{ is a partition of } [a,b] \}$$

and

$$U_a^b(f) = \inf\{ u(f,P) \mid P \text{ is a partition of } [a,b] \}.$$

(This assumes that these quantities are defined.) $L_a^b(f)$ is called the lower Darboux integral of f over [a,b],
and $U_a^b(f)$ is called the upper Darboux integral of f over [a,b]. If $L_a^b(f) = U_a^b(f)$, then we define

$$\int_a^b f = L_a^b(f) = U_a^b(f),$$

and call this the definite integral of f over [a,b].


Let’s step back and see the big picture of what we’re doing here. We’re dividing [a,b] up into little pieces,
and approximating the area under f using rectangles with our little pieces as the widths of the rectangles. We’re 
doing two different estimates: one with rectangles that are deliberately chosen to be too small, and the other with 
rectangles that are deliberately chosen to be too big. The former will give us a lower bound on the area under f, 
and the latter will give us an upper bound. As our partition of rectangles fits the curve more and more closely, the 
lower bound will rise and the upper bound will fall. If they meet, then the value at which they meet is the area 
under the curve. 


The symbol $\int$ is called an integral sign, and is supposed to remind you of an elongated letter “S” (as in “sum”),
because the definite integral is arrived at via a process involving sums of areas of rectangles. We read $\int_a^b f$ as “the
integral of f from a to b.” The function f is called the integrand of the integral, and the numbers a and b are called
the limits of integration.


There’s another peculiar feature of the notation, which is best illustrated by an example. If $f(x) = x^2$, then we
could write

$$\int_1^2 f = \int_1^2 f(x)\,dx = \int_1^2 x^2\,dx$$


for the area under f on [1,2]. In particular, if we write the integral in terms of a dummy variable (like x) in the 
function expression, then we write dx as a “term” of the integral. This looks a little silly right now, but as we will 
see soon there are very good reasons for doing so. 


Even though our definition is fairly abstract, we can use it to deduce some basic properties and calculate some 
simple definite integrals. Here are some basic properties about Darboux sums and definite integrals: 


Important: Let f be continuous on [a,b].
• If P is any partition of [a,b], then $l(f,P) \le u(f,P)$.
• If P and Q are any two partitions of [a,b], then $l(f,P) \le u(f,Q)$.
• $L_a^b(f) \le U_a^b(f)$.
• For any functions f and g,

$$\int_a^b (f + g) = \int_a^b f + \int_a^b g,$$

provided the integrals on the right side are defined.
• For any function f and any $a < c < b$,

$$\int_a^b f = \int_a^c f + \int_c^b f,$$

provided the integrals on either side are defined.


Hopefully, you find all of these properties to make sense. We will leave the proof of the first property as an 
exercise. The proofs of the remainder of the above properties are not difficult but are very technical, so we will 
omit them in this book; you will likely revisit them if you study real analysis later in your mathematical career. 


As a quick check that our definition is reasonable, let’s check it on a simple region for which we already know 
the area:


Problem 5.7: Let c be a constant and let f(x) = c. Compute $\int_a^b f$.


Solution for Problem 5.7: No matter what partition P we choose, we have that $h_i = H_i = c$ for all sections of the
partition. Therefore, every Darboux sum is equal to $c(b - a)$, and thus $\int_a^b f = c(b - a)$. This is consistent with our
intuitive notion of area: the region is just a rectangle with length $(b - a)$ and height c. □


Problem 5.8: Compute $\int_0^1 x^3\,dx$.


Solution for Problem 5.8: We’ll use a method similar to that of Problem 5.6. 


Let $P_n$ be the partition consisting of the intervals $[0, \frac{1}{n}], [\frac{1}{n}, \frac{2}{n}], \ldots, [\frac{n-1}{n}, 1]$. Since $f(x) = x^3$ is a strictly increasing
function, we have

$$h_i = f\left(\frac{i}{n}\right) = \frac{i^3}{n^3}, \qquad H_i = f\left(\frac{i+1}{n}\right) = \frac{(i+1)^3}{n^3}$$

for all $0 \le i < n$. This allows us to compute the Darboux sums for this function and these $P_n$:

$$l(f,P_n) = \sum_{i=0}^{n-1} \frac{i^3}{n^3}\left(\frac{1}{n}\right) = \frac{1}{n^4}\sum_{i=0}^{n-1} i^3,$$

$$u(f,P_n) = \sum_{i=0}^{n-1} \frac{(i+1)^3}{n^3}\left(\frac{1}{n}\right) = \frac{1}{n^4}\sum_{i=0}^{n-1} (i+1)^3.$$

We now use the formula

$$\sum_{i=1}^{m} i^3 = \frac{m^2(m+1)^2}{4},$$

and thus we have

$$l(f,P_n) = \frac{(n-1)^2 n^2}{4n^4} = \frac{1}{4}\left(1 - \frac{1}{n}\right)^2,$$

$$u(f,P_n) = \frac{n^2(n+1)^2}{4n^4} = \frac{1}{4}\left(1 + \frac{1}{n}\right)^2.$$

We notice that $\frac{1}{4}$ is the least upper bound for all of these $l(f,P_n)$ and that $\frac{1}{4}$ is the greatest lower bound for all
of these $u(f,P_n)$; in other words,

$$\frac{1}{4} = \sup\{ l(f,P_n) \mid n \ge 1 \} = \inf\{ u(f,P_n) \mid n \ge 1 \}.$$

We only examined a specific type of partition: those with n intervals of equal width. However, we need to consider
all partitions of [0,1] to compute the definite integral. Note that

$$L_0^1(f) = \sup\{ l(f,P) \mid P \text{ is a partition of } [0,1] \} \ge \sup\{ l(f,P_n) \mid n \ge 1 \} = \frac{1}{4}$$

and

$$U_0^1(f) = \inf\{ u(f,P) \mid P \text{ is a partition of } [0,1] \} \le \inf\{ u(f,P_n) \mid n \ge 1 \} = \frac{1}{4}.$$

Thus, we have

$$U_0^1(f) \le \frac{1}{4} \le L_0^1(f).$$

But we always have $L_0^1(f) \le U_0^1(f)$ (this is one of the properties listed in the box above). Therefore, we must have
$L_0^1(f) = U_0^1(f) = \frac{1}{4}$, and hence $\int_0^1 f = \frac{1}{4}$. □
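
The two closed forms for the Darboux sums in Problem 5.8 can be checked with exact arithmetic. The sketch below is illustrative only and not part of the original text; `darboux_sums_cubic` is a hypothetical helper name. It builds the lower and upper sums for $f(x) = x^3$ on the equal-width partition $P_n$ of [0,1] directly from the definitions.

```python
from fractions import Fraction


def darboux_sums_cubic(n):
    """Lower and upper Darboux sums of f(x) = x^3 on [0, 1] for n equal pieces."""
    width = Fraction(1, n)
    # f is increasing, so h_i = (i/n)^3 and H_i = ((i+1)/n)^3.
    lower = sum(Fraction(i, n) ** 3 * width for i in range(n))
    upper = sum(Fraction(i + 1, n) ** 3 * width for i in range(n))
    return lower, upper


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        lower, upper = darboux_sums_cubic(n)
        print(n, float(lower), float(upper))
```

For every n the lower sum equals $\frac{1}{4}(1 - \frac{1}{n})^2$ and the upper sum equals $\frac{1}{4}(1 + \frac{1}{n})^2$, so the two sums close in on $\frac{1}{4}$ from either side.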


It turns out that we did not need to be so cautious in our definition of the definite integral: 


Important: For every continuous function f defined on a closed interval [a,b], the definite
integral $\int_a^b f$ is defined.


This is a deep result and requires a substantial amount of technical $\delta$-$\epsilon$-style computations to prove. We
will omit the proof in this book, but many advanced calculus or real analysis books (such as [Sp]) have the 
proof. However, it is well within our grasp to prove that the definite integral exists for a monotonic continuous 
function—we will leave this proof to you as a Challenge Problem. 


Also, we can use the same definition to write the definite integral of many noncontinuous functions, although 
in this case there is no guarantee that the definite integral will exist. There is an example of a definite integral of a 
noncontinuous function in the exercises. 


Although Darboux sums are very useful for defining integrals, there is a more general construction that is 
broadly applicable to a wide range of problems. Given a function f continuous on [a,b] and a partition P of [a,b], 
we can construct a Riemann sum for f over P, by choosing any point in each piece of the partition to give us the 
height of the rectangle. 


Specifically, let P be the partition $a = x_0 < x_1 < x_2 < \cdots < x_n = b$, and let $w_i \in [x_i, x_{i+1}]$ be a point in the $(i+1)$st
piece of the partition. Then the Riemann sum associated with this partition and choice of points is

$$r(f,P,w_i) = \sum_{i=0}^{n-1} (x_{i+1} - x_i) f(w_i).$$

Again, this is just a sum of areas of rectangles: $(x_{i+1} - x_i)$ is the width and $f(w_i)$ is the height of each rectangle.
Note that choosing $w_i$ such that $f(w_i)$ is minimal gives us our lower Darboux sum $l(f,P)$, and choosing $w_i$ such
that $f(w_i)$ is maximal gives us our upper Darboux sum $u(f,P)$. Indeed, this shows that

$$l(f,P) \le r(f,P,w_i) \le u(f,P)$$

for any choice of $w_i$.


If we let $\Delta x$ denote the largest width in our partition, we have

$$\int_a^b f(x)\,dx = \lim_{\Delta x \to 0} \sum_{i=0}^{n-1} (x_{i+1} - x_i) f(w_i).$$

What this means is that as we let the widths of the rectangles of the Riemann sum get smaller and smaller, then the
Riemann sum approaches the definite integral $\int_a^b f$. However, the above expression is a bit vague: what exactly
does “the limit as $\Delta x \to 0$” mean? It takes a bit of work to make this concept rigorous, so for this reason, we prefer
Darboux sums for our rigorous definition of $\int_a^b f$. However, the concept of Riemann sums is more general and
turns out to be useful in applications, as we will soon see.


Concept: Whenever we can express a quantity, such as area under a graph, as the “limit”
of a Riemann sum as the widths of the pieces of the partition get smaller and
smaller, then the quantity can be represented by a definite integral.


Some common choices for $w_i \in [x_i, x_{i+1}]$ are:


• $w_i = x_i$, giving a left Riemann sum,
• $w_i = x_{i+1}$, giving a right Riemann sum,
• $w_i = (x_i + x_{i+1})/2$, giving a midpoint Riemann sum.
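

The three common choices of sample point translate directly into code. What follows is a minimal Python sketch rather than anything from the original text; the names `riemann_sum` and `uniform_partition` and the string values of the `rule` parameter are assumptions made for illustration. The partition is passed as an increasing list of points, so pieces of unequal width work just as well as equal ones.

```python
def riemann_sum(f, partition, rule="left"):
    """Riemann sum of f over a partition given as an increasing list of points.

    rule selects the sample point w_i in each piece [x_i, x_{i+1}]:
    "left" uses x_i, "right" uses x_{i+1}, "midpoint" uses their average.
    """
    total = 0.0
    for x_left, x_right in zip(partition, partition[1:]):
        if rule == "left":
            w = x_left
        elif rule == "right":
            w = x_right
        elif rule == "midpoint":
            w = (x_left + x_right) / 2
        else:
            raise ValueError("rule must be 'left', 'right', or 'midpoint'")
        total += (x_right - x_left) * f(w)
    return total


def uniform_partition(a, b, n):
    """The partition of [a, b] into n pieces of equal width."""
    return [a + (b - a) * i / n for i in range(n + 1)]


points = uniform_partition(0.0, 1.0, 100)
left = riemann_sum(lambda x: x ** 3, points, "left")
right = riemann_sum(lambda x: x ** 3, points, "right")
mid = riemann_sum(lambda x: x ** 3, points, "midpoint")

if __name__ == "__main__":
    print(left, mid, right)
```

Because $x^3$ is increasing on [0,1], the left sum here coincides with the lower Darboux sum and the right sum with the upper Darboux sum, while the midpoint sum lands between them and much closer to $\frac{1}{4}$.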


There is one more definition that is very convenient to have:


Definition: If a < b and $\int_a^b f$ is defined, then

$$\int_b^a f = -\int_a^b f.$$


We think of $\int_b^a f$ as “the integral of f from b to a,” which gives the area under the graph of f on [a,b] in the
opposite direction, and thus gives $-1$ times the usual area.


EXERCISES 
5.1.1 Compute the following: 


2 
(a) f 4dx (b) f (3x — 1) dx (c) { (x? + 1) dx 
ee 2 1 
2 
(d) { x? dx (e) fs (2x — 1) dx (f) f Lx] dx 
nai =3 0 


5.1.2 If f is a continuous function and $a \in \mathrm{Dom}(f)$, then what is $\int_a^a f$?
5.1.3 Prove, for any function f and partition P of [a,b], that $l(f,P) \le u(f,P)$.


5.2 THE FUNDAMENTAL THEOREM OF CALCULUS


It would be quite a hassle if every time we wished to compute a definite integral we had to set up Darboux 
sums or Riemann sums and compute areas of rectangles. Fortunately, there is a relatively easy way to compute 
definite integrals. There is a fundamental theorem of calculus—in fact, we call it the Fundamental Theorem of 
Calculus—that relates definite integrals to derivatives in an amazing way. The way we approach this is via the 
Mean Value Theorem. 


Let f be a continuous function on [a,b], and let F be a function whose derivative is f, so that $F' = f$. Such a
function is called an antiderivative of f. We’re interested in computing $\int_a^b f$, so we let P be a partition of [a,b],
given by

$$a = x_0 < x_1 < \cdots < x_n = b.$$

For each interval $[x_i, x_{i+1}]$ in the partition, the Mean Value Theorem, applied to the function F, tells us that there is
some value $z_i \in [x_i, x_{i+1}]$ such that

$$f(z_i) = F'(z_i) = \frac{F(x_{i+1}) - F(x_i)}{x_{i+1} - x_i}.$$

When we multiply both sides by the denominator, we get

$$f(z_i)(x_{i+1} - x_i) = F(x_{i+1}) - F(x_i).$$

When we sum this over all the intervals, we get

$$\sum_{i=0}^{n-1} f(z_i)(x_{i+1} - x_i) = \sum_{i=0}^{n-1} (F(x_{i+1}) - F(x_i)). \qquad (*)$$

Let’s look at the two sides of equation (*) separately.


On the left side of (*), we have a Riemann sum. In particular, going back to the notation in the previous section,
we note that $h_i \le f(z_i) \le H_i$ for all $0 \le i < n$, and thus the left side of (*) lies between $l(f,P)$ and $u(f,P)$:

$$l(f,P) = \sum_{i=0}^{n-1} h_i (x_{i+1} - x_i) \le \sum_{i=0}^{n-1} f(z_i)(x_{i+1} - x_i) \le \sum_{i=0}^{n-1} H_i (x_{i+1} - x_i) = u(f,P). \qquad (**)$$

On the right side of (*), we have a telescoping sum, which means that most of the terms cancel out:

$$\begin{aligned}
\sum_{i=0}^{n-1} (F(x_{i+1}) - F(x_i)) &= (F(x_1) - F(x_0)) + (F(x_2) - F(x_1)) + (F(x_3) - F(x_2)) + \cdots + (F(x_n) - F(x_{n-1})) \\
&= -F(x_0) + F(x_n) \\
&= F(b) - F(a).
\end{aligned}$$

Thus, the right side of (*) is just $F(b) - F(a)$. Substituting this expression into the middle of (**) gives

$$l(f,P) \le F(b) - F(a) \le u(f,P)$$

for all partitions P of [a,b]. Thus, $F(b) - F(a)$ is an upper bound for the set of all $l(f,P)$ and is a lower bound for
the set of all $u(f,P)$, and hence

$$L_a^b(f) \le F(b) - F(a) \le U_a^b(f).$$

But if the definite integral $\int_a^b f$ is defined, then $L_a^b(f) = \int_a^b f = U_a^b(f)$. Thus, there is exactly one value that can
be placed in the middle of the above inequality for all P, and that value, by definition, is $\int_a^b f$. Therefore, we
conclude that

$$\int_a^b f(x)\,dx = F(b) - F(a),$$

provided that the definite integral is defined. As we mentioned in the last section, if f is continuous on [a,b], then
the definite integral $\int_a^b f$ is defined, so we have:


Important: The Fundamental Theorem of Calculus, Part I: If f is a continuous function on
[a,b] with antiderivative F, then

$$\int_a^b f(x)\,dx = F(b) - F(a).$$


Note that we have assumed that f has an antiderivative F. It turns out that every continuous function has an 
antiderivative: this is Part II of the Fundamental Theorem of Calculus, which we will see in a little bit. 


Let’s go back to our example from Problem 5.8, and we'll see how much the Fundamental Theorem of Calculus
makes our lives easier:


Problem 5.9: Compute $\int_0^1 x^3\,dx$.


Solution for Problem 5.9: In order to apply the Fundamental Theorem of Calculus, we need to determine an
antiderivative of $f(x) = x^3$. Happily, that’s easy: we let $F(x) = \frac{1}{4}x^4$, so that $F'(x) = x^3 = f(x)$. The Fundamental
Theorem then tells us immediately that

$$\int_0^1 x^3\,dx = F(1) - F(0) = \frac{1}{4}(1)^4 - \frac{1}{4}(0)^4 = \frac{1}{4}.$$ □


More commonly, the computation from Problem 5.9 is written like this:

$$\int_0^1 x^3\,dx = \left.\frac{x^4}{4}\right|_0^1 = \frac{1}{4}.$$

The vertical bar means to take the function $\left(\frac{x^4}{4}\right)$ evaluated at the top number (x = 1) minus the function evaluated
at the bottom number (x = 0). If there is any question about what variable we are using to evaluate the function,
we can list the variable explicitly, such as

$$\int_0^1 x^3\,dx = \left.\frac{x^4}{4}\right|_{x=0}^{1} = \frac{1}{4}.$$

However, we will usually omit the variable if there is no ambiguity.

Let’s look at another basic example of computing a definite integral.


Problem 5.10: Compute $\int_0^\pi \sin x\,dx$.


Solution for Problem 5.10: An antiderivative of $\sin x$ is $-\cos x$, because $\frac{d}{dx}(-\cos x) = -(-\sin x) = \sin x$. Thus, via
the Fundamental Theorem, we have:

$$\int_0^\pi \sin x\,dx = (-\cos x)\Big|_0^\pi.$$

Therefore, the integral is equal to $(-\cos \pi) - (-\cos 0) = -(-1) - (-1) = 2$. □
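
It can be reassuring to watch the Fundamental Theorem agree with a brute-force sum. The following minimal Python sketch is not part of the original text, and the names `midpoint_sum` and `antiderivative` are assumptions made for illustration. It compares a midpoint Riemann sum for $\sin x$ on $[0, \pi]$ with the value $F(\pi) - F(0)$ obtained from the antiderivative $F(x) = -\cos x$.

```python
import math


def midpoint_sum(f, a, b, n=10_000):
    """Midpoint Riemann sum of f on [a, b] with n pieces of equal width."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def antiderivative(x):
    """F(x) = -cos x, an antiderivative of sin x."""
    return -math.cos(x)


approx = midpoint_sum(math.sin, 0.0, math.pi)
exact = antiderivative(math.pi) - antiderivative(0.0)

if __name__ == "__main__":
    print(approx, exact)
```

The Riemann sum needs ten thousand function evaluations to get close to 2, while the antiderivative needs two, which is the practical payoff of the theorem.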


There is another version of the Fundamental Theorem of Calculus that is also very useful. Let’s see how this
works with an example.


Problem 5.11: Compute $g(x) = \int_0^x \cos t\,dt$.


We notice that there are two variables in our equation. In particular, the variable x is one of our limits of
integration, so we need a different variable (in this example, t) to represent the variable in the function that we are
integrating.


Solution for Problem 5.11: We proceed as usual: since sin is an antiderivative of cos, we can simply evaluate the
integral as

$$g(x) = \int_0^x \cos t\,dt = \sin t\,\Big|_{t=0}^{x} = \sin x - \sin 0 = \sin x.$$

Evaluation of the integral eliminated the dummy variable t, and what remains is a function of x. □


More generally, we start with a continuous function f and use the definite integral to define a new function,
by selecting a point $x_0$ in the domain of f and defining

$$g(x) = \int_{x_0}^{x} f(t)\,dt.$$

Note that we use a new dummy variable t in our integral. If F is an antiderivative of f, then the Fundamental
Theorem gives us:

$$g(x) = \int_{x_0}^{x} f(t)\,dt = F(x) - F(x_0).$$

We now differentiate this with respect to x. Note that since $F(x_0)$ is a constant, we have $\frac{d}{dx}F(x_0) = 0$, and thus we
get

$$g'(x) = \frac{d}{dx}F(x) - \frac{d}{dx}F(x_0) = f(x).$$

Putting this all together gives us:


Important: The Fundamental Theorem of Calculus, Part II: Let f be a continuous function
and $x_0$ a number in the domain of f. Define a function g by

$$g(x) = \int_{x_0}^{x} f(t)\,dt$$

for all x such that $[x_0, x]$ (for $x \ge x_0$) or $[x, x_0]$ (for $x < x_0$) is a subset of the
domain of f. Then $g'(x) = f(x)$ for all x where both functions are defined.


This is sometimes also written as

$$\frac{d}{dx}\int_{x_0}^{x} f(t)\,dt = f(x).$$

Unfortunately, our argument above presupposes the existence of an antiderivative F of f. We want the 
Fundamental Theorem of Calculus to be stronger: we want it to imply the existence of an antiderivative. 


Problem 5.12: Let f be a continuous function and $x_0 \in \mathrm{Dom}(f)$. We wish to prove Part II of the Fundamental
Theorem of Calculus without presupposing the existence of an antiderivative of f. Define

$$g(x) = \int_{x_0}^{x} f(t)\,dt$$

wherever the right side is defined. We will prove that $g'(x) = f(x)$, as follows:
(a) Show that $g(x+h) - g(x) = \int_x^{x+h} f(t)\,dt$ for any h.
(b) Show that

$$g'(x) = \lim_{h \to 0} \frac{g(x+h) - g(x)}{h} = f(x).$$


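Solution for Problem 5.12:


WARNING!! The proof can be confusing, because of the different variables that are used.
We are using the variable x as the independent variable of the function g,
and the variable t as the dummy variable in the definite integrals.


(a) We know that

$$\int_{x_0}^{x+h} f(t)\,dt = \int_{x_0}^{x} f(t)\,dt + \int_{x}^{x+h} f(t)\,dt,$$

so rearranging gives

$$\int_{x}^{x+h} f(t)\,dt = \int_{x_0}^{x+h} f(t)\,dt - \int_{x_0}^{x} f(t)\,dt = g(x+h) - g(x).$$


(b) By the definition of limit, we must show that for any $\epsilon > 0$, there exists $\delta > 0$ such that

$$0 < |h| < \delta \implies \left| \frac{g(x+h) - g(x)}{h} - f(x) \right| < \epsilon.$$

Since f is continuous, we may choose $\delta$ such that

$$|t - x| < \delta \implies |f(t) - f(x)| < \epsilon \implies f(x) - \epsilon < f(t) < f(x) + \epsilon.$$

Thus, if $|h| < \delta$, we have

$$\int_x^{x+h} (f(x) - \epsilon)\,dt < \int_x^{x+h} f(t)\,dt < \int_x^{x+h} (f(x) + \epsilon)\,dt.$$

But the outer integrals in the above expression are definite integrals of constant functions (that is, they don’t
depend on t), so they can be explicitly evaluated, and we get

$$h(f(x) - \epsilon) < \int_x^{x+h} f(t)\,dt < h(f(x) + \epsilon).$$

(These inequalities are written for h > 0. For h < 0 each one is reversed, and dividing by the negative number h
reverses them back, so the next line holds in either case.)

We now apply part (a) and divide by h, giving

$$f(x) - \epsilon < \frac{g(x+h) - g(x)}{h} < f(x) + \epsilon.$$

This means that

$$0 < |h| < \delta \implies \left| \frac{g(x+h) - g(x)}{h} - f(x) \right| < \epsilon,$$

hence, by definition,

$$g'(x) = \lim_{h \to 0} \frac{g(x+h) - g(x)}{h} = f(x),$$

as desired.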


A nice feature of Part II of the Fundamental Theorem is that we no longer have to presuppose that a continuous
function f has an antiderivative F. The Fundamental Theorem not only tells us that F must exist, it even tells us
how to construct it:

$$F(x) = \int_{x_0}^{x} f(t)\,dt.$$


Another use of this version of the Fundamental Theorem of Calculus is to rigorously define functions that would
otherwise be hard to describe. For example, back in Chapter 1, we gave a vague definition of the exponential and
natural logarithm functions. This definition forced us to assume many of the nice properties that these functions
satisfy. However, using a definite integral, we can rigorously define the natural logarithm to be the function whose
derivative is $\frac{1}{x}$ by defining, for all x > 0,

$$\log x = \int_1^x \frac{1}{t}\,dt.$$


Using this definition, we can now prove all of the features of the natural logarithm function and its inverse, the 
exponential function. We present the details in Section 5.A. 
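
The integral definition of the logarithm is concrete enough to compute with. Here is a minimal Python sketch, not part of the original text; `log_by_integration` is a hypothetical helper name, and the midpoint sum is just one convenient way to approximate the integral. It evaluates $\int_1^x \frac{1}{t}\,dt$ numerically so the result can be compared with the library function `math.log`.

```python
import math


def log_by_integration(x, n=100_000):
    """Approximate log x = integral of 1/t dt from 1 to x (x > 0) by a midpoint sum."""
    if x <= 0:
        raise ValueError("log x is only defined here for x > 0")
    width = (x - 1) / n  # negative when 0 < x < 1, which gives a negative value
    return sum(1 / (1 + (i + 0.5) * width) for i in range(n)) * width


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 6.0):
        print(x, log_by_integration(x), math.log(x))
```

Even this crude approximation exhibits the familiar behavior: the value at 1 is 0, values for 0 < x < 1 are negative, and the value at 6 matches the sum of the values at 2 and 3 to within the approximation error.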


Also, we often use definite integrals to define functions that couldn’t really be defined any other way. For
example, the following function is quite important in statistics:

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt.$$

This is called the error function. Note that, by definition, we have

$$\frac{d}{dx}\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\,e^{-x^2}.$$

The constant $\frac{2}{\sqrt{\pi}}$ is there to make certain calculations in statistics come out nicely.
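

Since the error function is defined by an integral, a Riemann sum is a natural way to evaluate it. The sketch below is illustrative Python, not part of the original text, and `erf_by_integration` is a hypothetical helper name. It applies a midpoint sum to $\frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$ and can be compared against the standard library's `math.erf`.

```python
import math


def erf_by_integration(x, n=20_000):
    """Approximate erf(x) = (2/sqrt(pi)) * integral of exp(-t^2) dt from 0 to x."""
    width = x / n  # negative for x < 0, so erf comes out odd, as it should
    total = sum(math.exp(-((i + 0.5) * width) ** 2) for i in range(n)) * width
    return 2 / math.sqrt(math.pi) * total


if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.5, 2.0):
        print(x, erf_by_integration(x), math.erf(x))
```

Library implementations use faster series and rational approximations, but they compute the same integral-defined function.
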
EXERCISES 


5.2.1 Compute the following definite integrals: 


2 1 m/4 2 
(a) f x! dx (b) fp e* dx (c) { . sec* 0d@ (d) [ ; 2x dx 


5.2.2 Suppose f is continuous (with antiderivative F) and a < c < b. Explain why

$$\int_a^b f = \int_a^c f + \int_c^b f.$$


5.2.3 Recall that a function f is called odd if $f(-x) = -f(x)$ for all x. Show that, if f is continuous and odd, and
$[-a,a]$ is in the domain of f, then $\int_{-a}^{a} f(x)\,dx = 0$. Hints: 215


5.2.4 Suppose f is differentiable and g is continuous. Show that 


F(x) 
Ef sat=syrons'on, 


Hints: 281 


5.3 INTEGRATION METHODS


In order to apply the Fundamental Theorem of Calculus to compute definite integrals of the form $\int_a^b f(x)\,dx$, we
need to be able to find an antiderivative of f; that is, we need a function F such that F’ = f. This process is called 
antidifferentiation (for obvious reasons), and while finding derivatives is pretty easy, finding antiderivatives can 
be much more difficult, as we will explore in this section.


5.3.1 Definition and basic examples


Recall the definition: 


Definition: Let f be a function. A function g is called an antiderivative of f if g is differentiable and $g' = f$.


As we saw when discussing the Fundamental Theorem of Calculus, the usual convention is that if a function is 
denoted by a lowercase letter, then its antiderivative is denoted by the same letter, but in uppercase. For example, 
F is an antiderivative of f, meaning that F’ = f. 


Notice that we said an antiderivative rather than the antiderivative. Did we need to be so careful? In other 
words, does a function have a unique antiderivative? 


Problem 5.13: Suppose that $F_1$ and $F_2$ are antiderivatives of f. How are $F_1$ and $F_2$ related? Must they be
equal?


Solution for Problem 5.13: It’s not so hard to come up with an example of two different functions that have the 
same derivative. For example, if f = 0 and g = 1 (the constant functions 0 and 1, respectively), then f’ = g’ = 0.
Furthermore, all constant functions have derivative 0, so any constant function is an antiderivative of the function 
f =0. Perhaps this generalizes somehow. 


Suppose $F_1$ and $F_2$ are both antiderivatives of f; that is, $F_1' = F_2' = f$. Then
$$(F_1 - F_2)' = F_1' - F_2' = f - f = 0.$$


So $F_1 - F_2$ is a function whose derivative is 0 everywhere, and we know that the only such functions are constant
functions. Thus $F_1 - F_2 = c$ for some constant c. □


Important: If $F_1$ and $F_2$ are antiderivatives of the same function f, then $F_1 - F_2 = c$ for some
constant $c \in \mathbb{R}$.


So there is no “unique” antiderivative of a function, but the antiderivative is determined up to a constant term. 


Antiderivatives are also called indefinite integrals, and have a familiar notation: if F is an antiderivative of f,
then we write
$$F = \int f = \int f(x)\,dx.$$


Note that this is the same notation as we use for definite integrals, except that an indefinite integral does not have 
limits of integration. As with definite integrals, if we write a dummy variable (like x) in the function expression 
of the indefinite integral, then we write dx as a “term” of the integral. 


The use of an equal sign in the above “equation” is a little strange, as we know that $\int f$ is determined only up
to a constant term. Instead, we write
$$\int f = F + C,$$


where C denotes the arbitrary constant term. (Traditionally, it’s almost always written with a capital “C.”) 
This notation also gives us a convenient shorthand that relates our two types of integrals via the Fundamental
Theorem of Calculus:
$$\int_a^b f(x)\,dx = \left.\left(\int f(x)\,dx\right)\right|_a^b.$$


Pay close attention to the difference between the two types of integrals. The integral on the left, $\int_a^b f(x)\,dx$, is a
definite integral, and is equal to a number, specifically the area under the curve y = f(x) on the interval [a,b]. The
integral on the right, $\int f(x)\,dx$, is an indefinite integral, and is equal to a function, specifically the antiderivative
of f. (Actually, since it’s only determined up to a constant, it’s a whole set of functions.) When we then take the 
difference of this antiderivative evaluated at x = b and x = a, we get the definite integral on the left side of the 
above equation. 


Since differentiation is linear—that is, (f + g)’ = f’ + g’ and (cf)’ = cf’—we expect that antidifferentiation is 
too. 


Problem 5.14: 
(a) Show that if F and G are antiderivatives of f and g, respectively, then F + G is an antiderivative of f + g. 


(b) Show that if F is an antiderivative of f and $c \in \mathbb{R}$ is a constant, then cF is an antiderivative of cf.


Solution for Problem 5.14: For both parts, we simply use the corresponding linearity properties of the derivative:
(a) $(F + G)' = F' + G' = f + g$.

(b) $(cF)' = c(F') = cf$.

□


We can express the linearity of antidifferentiation in integral notation: 


Important: If f and g are continuous functions and $c \in \mathbb{R}$, then
$$\int (f + g) = \int f + \int g \quad\text{and}\quad \int cf = c\int f.$$

WARNING!! Statements “equating” indefinite integrals are only true up to a constant.


By the Fundamental Theorem, these linearity properties extend to definite integrals too: 


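Important: If f and g are continuous functions with $[a,b] \subseteq (\mathrm{Dom}(f) \cap \mathrm{Dom}(g))$, and $c \in \mathbb{R}$,
then
$$\int_a^b (f + g) = \int_a^b f + \int_a^b g \quad\text{and}\quad \int_a^b cf = c\int_a^b f.$$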


5.3.2 Antiderivatives of common functions 


We have a nice catalog of functions—polynomials, trig functions, exponentials, etc—that we know how to 
differentiate. We'll try to develop a similar catalog of functions that we can antidifferentiate. 


We'll start with the class of functions that we typically find easiest to deal with: polynomials. 


Problem 5.15: 
(a) If n is any positive integer, find $\int x^n\,dx$.


(b) Does your formula from part (a) work for n = 0? Why or why not? 
(c) Does your formula from part (a) work for negative integers? Why or why not? 
Solution for Problem 5.15:


(a) We know that when we take the derivative of a monomial, the exponent decreases by 1:
$$\frac{d}{dx}\,x^n = n x^{n-1}.$$
Therefore, when we take the antiderivative, which is the opposite of taking the derivative, we expect the
exponent to increase by 1. We start with
$$\frac{d}{dx}\,x^{n+1} = (n+1)x^n.$$
But we want something whose derivative is $x^n$, not $(n+1)x^n$, so we divide by n + 1:
$$\frac{d}{dx}\,\frac{x^{n+1}}{n+1} = x^n.$$
Therefore,
$$\int x^n\,dx = \frac{x^{n+1}}{n+1} + C.$$
Don’t forget the “+C”! Remember that antiderivatives are only determined up to a constant term.


(b) Our derivative formula $(x^m)' = m x^{m-1}$ works for any integer m, whether positive, zero, or negative. Thus,
everything that we did is valid for n = 0 too. Hence,
$$\int x^0\,dx = \int 1\,dx = x + C.$$


(c) Why can’t we do the same thing? It’s correct that the statement
$$\frac{d}{dx}\,x^{n+1} = (n+1)x^n$$
is true for any integer n. However, it’s the last little step from our argument in part (a) that might break:
dividing by n + 1 is only valid if $n \neq -1$, since we can’t divide by 0.


Thus, the formula holds for every integer except −1.
WARNING!! It’s easy for calculus beginners to make computational mistakes while taking
the derivatives and antiderivatives of polynomials.


To take the derivative of $x^n$, we decrease the exponent by 1, then multiply by
the original exponent:
$$\frac{d}{dx}\,x^n = n x^{n-1}.$$


To take the antiderivative of $x^n$, we increase the exponent by 1, then divide by
the new exponent:
$$\int x^n\,dx = \frac{x^{n+1}}{n+1} + C.$$


Make sure that you see how these two operations are the reverse of each
other. Eventually, these operations will be as automatic as addition and
subtraction.


This naturally leads us to wonder: what is the antiderivative of $f(x) = x^{-1} = \frac{1}{x}$? Fortunately, we know a
function whose derivative is $\frac{1}{x}$. We saw this in Chapter 3:
$$\frac{d}{dx}(\log x) = \frac{1}{x}.$$
But be careful!


Bogus Solution: Therefore we conclude that
$$\int \frac{1}{x}\,dx = \log x + C.$$


The problem is that $\frac{1}{x}$ is defined for all nonzero x, whereas log x is defined only for positive x. But we can fix this!


Notice that
$$\frac{d}{dx}\log(-x) = \frac{1}{-x}\cdot(-1) = \frac{1}{x}.$$


So for x < 0, the function log(−x) is an antiderivative of $\frac{1}{x}$. We can combine the x > 0 and x < 0 cases by using
absolute value:
$$\int \frac{1}{x}\,dx = \log|x| + C.$$


This now makes sense for all nonzero x. Note that we sometimes “put the dx in the numerator” and write this as
$$\int \frac{dx}{x} = \log|x| + C.$$


WARNING!! Don’t forget the absolute value when taking the antiderivative of $\frac{1}{x}$.
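
To see why the absolute value matters in practice, here is a short worked example over an interval of negative numbers. The interval [−2, −1] is chosen purely for illustration and does not come from the text.

```latex
\int_{-2}^{-1} \frac{dx}{x}
  = \log|x|\,\Big|_{-2}^{-1}
  = \log 1 - \log 2
  = -\log 2 .
```

Without the absolute value we would be asked to evaluate log(−1) and log(−2), which are undefined over the reals. The negative answer is also what we should expect, since 1/x is negative on the whole interval.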


Combining our antiderivatives of monomials with our linearity rules from Problem 5.14, we can find the 
antiderivative of any polynomial. 
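
The power rule together with linearity is mechanical enough to sketch in a few lines of code. The following is a minimal illustration, not a general symbolic integrator: it assumes a polynomial is stored as a list of coefficients (index k holds the coefficient of x^k), and the function names and the sample polynomial are invented for this example.

```python
from fractions import Fraction

def antiderivative(coeffs):
    """Antiderivative of a polynomial, taking the constant C to be 0.

    coeffs[k] is the coefficient of x**k. Each term c*x**k becomes
    c/(k+1) * x**(k+1), so every coefficient shifts up one slot.
    """
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]

def derivative(coeffs):
    """Derivative of a polynomial: c*x**k becomes k*c*x**(k-1)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

p = [5, -4, 6]            # 5 - 4x + 6x^2
P = antiderivative(p)     # coefficients of 5x - 2x^2 + 2x^3
print(P)
print(derivative(P))      # differentiating recovers the original coefficients
```

Exact fractions are used so that dividing by the new exponent introduces no rounding error.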
Problem 5.16: Find the following antiderivatives:
(a) $\int (3x^2 - 7x + 2)\,dx$


(b) $\int \left(\frac{4}{x^{11}} - \frac{2}{x^{5}}\right) dx$


(c) $\int (5x + 2)^3\,dx$


Solution for Problem 5.16: 


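(a) We use our formula from Problem 5.15, and the fact that antiderivatives are linear:
$$\begin{aligned}
\int (3x^2 - 7x + 2)\,dx &= 3\int x^2\,dx - 7\int x\,dx + 2\int 1\,dx \\
&= 3\left(\frac{x^3}{3}\right) - 7\left(\frac{x^2}{2}\right) + 2x + C \\
&= x^3 - \frac{7}{2}x^2 + 2x + C.
\end{aligned}$$
Once you’ve gotten the hang of this, there’s no need to write all the intermediate steps. Just do the whole
thing in one step:
$$\int (3x^2 - 7x + 2)\,dx = x^3 - \frac{7}{2}x^2 + 2x + C.$$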


One thing that might worry you is the constant “+C” at the end. Specifically, since we broke up our integral 
into 3 smaller integrals, why don’t we have 3 constants? In other words, shouldn’t the answer be the 


following: 
$$\begin{aligned}
\int (3x^2 - 7x + 2)\,dx &= 3\int x^2\,dx - 7\int x\,dx + 2\int 1\,dx \\
&= (x^3 + C_1) + \left(-\frac{7}{2}x^2 + C_2\right) + (2x + C_3) \\
&= x^3 - \frac{7}{2}x^2 + 2x + (C_1 + C_2 + C_3)?
\end{aligned}$$


One way to answer this is that we just say $C = C_1 + C_2 + C_3$, so we don’t have to worry about it. A more
thoughtful answer is that any antiderivative of $3x^2 - 7x + 2$ is going to differ from $x^3 - \frac{7}{2}x^2 + 2x$ by a constant,
and we label that single constant C. All of this is a result of the rather sloppy notation of writing that
$\int f(x)\,dx = F(x) + C$, when actually $\int f(x)\,dx$ signifies any function that is an antiderivative of f(x), not one
particular function. But the “sloppy” notation is nearly universal, so we'll stick with it.


(b) We can do the same sort of calculation with negative exponents, but extra caution is required as it is a lot 
easier to make computational mistakes when dealing with negative exponents: 


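$$\begin{aligned}
\int \left(\frac{4}{x^{11}} - \frac{2}{x^{5}}\right) dx &= 4\int \frac{dx}{x^{11}} - 2\int \frac{dx}{x^{5}} \\
&= 4\left(-\frac{1}{10x^{10}}\right) - 2\left(-\frac{1}{4x^{4}}\right) + C \\
&= -\frac{2}{5x^{10}} + \frac{1}{2x^{4}} + C.
\end{aligned}$$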


Again, while you’re still getting accustomed to these calculations, be extra careful with the signs and the 
constants. You might find it easier to rewrite the original integral in terms of negative exponents before 
proceeding with computing the antiderivatives:
$$\int \left(\frac{4}{x^{11}} - \frac{2}{x^{5}}\right) dx = \int \left(4x^{-11} - 2x^{-5}\right) dx.$$


Once you’ve become more comfortable with antidifferentiation, you'll be able to do the whole thing in one 
line without writing the intermediate steps: 


$$\int \left(\frac{4}{x^{11}} - \frac{2}{x^{5}}\right) dx = -\frac{4}{10x^{10}} + \frac{2}{4x^{4}} + C = -\frac{2}{5x^{10}} + \frac{1}{2x^{4}} + C.$$
(c) Just as with the corresponding derivative calculation, it’s tempting to take the following “shortcut”: 


Bogus Solution:
$$\int (5x + 2)^3\,dx = \frac{(5x + 2)^4}{4} + C.$$


We can see the error of our ways by reversing the process, that is, by taking the derivative. Of course, doing
so requires using the Chain Rule.
$$\frac{d}{dx}\,\frac{(5x + 2)^4}{4} = \left(\frac{4(5x + 2)^3}{4}\right)\left(\frac{d}{dx}(5x + 2)\right) = 5(5x + 2)^3.$$


So we’re off by a factor of 5. Indeed, what we’ve just shown is that
$$\int 5(5x + 2)^3\,dx = \frac{(5x + 2)^4}{4} + C,$$
and therefore, dividing by 5 gives our answer:
$$\int (5x + 2)^3\,dx = \frac{(5x + 2)^4}{20} + C.$$


As you might suspect, this leads us to a Chain Rule for antiderivatives; we'll see this in the next subsection.


□


Let’s continue building our catalog of antiderivatives of common functions. By now, we should be very 
comfortable using trigonometric and exponential functions, and we know their derivatives, so we should be able 
to find their antiderivatives too. 


Problem 5.17: 
(a) Find $\int \sin x\,dx$ and $\int \cos x\,dx$.
(b) Find $\int e^x\,dx$.
Solution for Problem 5.17:
(a) Since we know that
$$\frac{d}{dx}\sin x = \cos x \quad\text{and}\quad \frac{d}{dx}\cos x = -\sin x,$$
we immediately see that
$$\int \cos x\,dx = \sin x + C \quad\text{and}\quad \int \sin x\,dx = -\cos x + C.$$

(b) Since $e^x$ is the derivative of $e^x$, it is also an antiderivative of $e^x$. In other words,
$$\int e^x\,dx = e^x + C.$$


5.3.3 The Chain Rule 


With differentiation, we’ve got it pretty easy. Using our catalog of basic functions, such as polynomials, trig 
functions, and exponentials, together with some basic rules such as the Product and Quotient Rules, the Chain 
Rule, and the Inverse Function Rule, we can take the derivative of pretty much any “nice” function that we’re 
likely to encounter. 


With antidifferentiation, however, we’re going to have to work up a little bit of a sweat. Antidifferentiation is 
straightforward for really simple functions like polynomials, basic trig functions, and exponentials, but once we 
start getting away from the basics, we quickly find that antidifferentiation is a lot harder than differentiation. There 
are some general techniques that we can use, but as you'll discover as you work through the progressively-harder 
examples in this section, antidifferentiation is a bit of an art form, and requires lots of practice. 


Sidenote: Every January, the Massachusetts Institute of Technology holds its annual Integration
Bee, where MIT students compete against the clock and each other to quickly
and accurately evaluate integrals. The winner is crowned Grand Integrator.


Recall that, in Chapter 3, we learned the Chain Rule for taking derivatives. We can use the Chain Rule in 
reverse to compute many antiderivatives, such as the following: 


Problem 5.18: Compute $\int (12x)(2x^2 + 1)^2\,dx$.


Solution for Problem 5.18: We have
$$\frac{d}{dx}(2x^2 + 1)^3 = 3(2x^2 + 1)^2 \cdot \frac{d}{dx}(2x^2 + 1) = 3(2x^2 + 1)^2(4x) = (12x)(2x^2 + 1)^2.$$


Thus $\int (12x)(2x^2 + 1)^2\,dx = (2x^2 + 1)^3 + C$. □


Our goal is to develop a systematic way to apply the Chain Rule in reverse. That is, if we start with a function
like $(12x)(2x^2 + 1)^2$, how do we “undo” the Chain Rule in order to compute $\int (12x)(2x^2 + 1)^2\,dx$?


To see how we do this for antiderivatives, we go back and think about what the Chain Rule really means. One 
way that we wrote the Chain Rule was: 
$$\frac{df}{dx} = \frac{df}{du}\cdot\frac{du}{dx}.$$
For example, if $f(x) = (2x^2 + 1)^3$ as in Problem 5.18, then we can set $u = 2x^2 + 1$. We then write $f(u) = u^3$, so that
$\frac{df}{du} = 3u^2$. Also we have $\frac{du}{dx} = \frac{d}{dx}(2x^2 + 1) = 4x$. When we put all this together, we get
$$\frac{df}{dx} = \frac{df}{du}\cdot\frac{du}{dx} = 3u^2 \cdot 4x = 3(2x^2 + 1)^2 \cdot 4x = (12x)(2x^2 + 1)^2.$$


Of course, by now you've probably internalized this procedure so that you don’t explicitly write the “u” step; you
just go straight to the result $\frac{df}{dx}$, as we did above when we first presented this example. However, when we try to
apply the Chain Rule “in reverse” to compute an antiderivative, as we're about to see, we'll usually be explicit
about denoting what u is in our Chain Rule calculation.


Let’s now look at our example from Problem 5.18:
$$\int (12x)(2x^2 + 1)^2\,dx.$$


We will make a clever substitution so that we can use the Chain Rule in reverse. We substitute $u = 2x^2 + 1$. If we
just did this alone, and nothing else, we’d have
$$\int (12x)u^2\,dx.$$


That’s not too good: this integral is an ugly mix of x and u. So not only do we need to substitute u, we also need
to substitute for the derivative of u. Specifically, we have
$$\frac{du}{dx} = \frac{d}{dx}(2x^2 + 1) = 4x,$$
so our integral becomes
$$\int (12x)u^2\,dx = \int 3u^2\left(\frac{du}{dx}\right)dx.$$


We now have the terms from the Chain Rule explicitly shown in our integral, since
$$\frac{d}{dx}(u^3) = 3u^2\left(\frac{du}{dx}\right),$$
so our integral is
$$\int (12x)u^2\,dx = \int 3u^2\left(\frac{du}{dx}\right)dx = \int \frac{d}{dx}(u^3)\,dx = u^3 + C,$$
where the final step is simply applying the Fundamental Theorem of Calculus.


We have a notational shortcut that is almost universally used for this sort of Chain Rule calculation. Starting
with $\frac{du}{dx} = 4x$, we “multiply by dx” and get du = 4x dx. When we make this substitution in our integral, we’re left
with simply
$$\int 3u^2\,du.$$
Note that du is notational shorthand for $\frac{du}{dx}\,dx$. Now it is clear that the antiderivative is just $u^3 + C$.


But we don’t want our final answer in terms of u; we want it in terms of x, our original variable. When we
undo the substitution, we have our final answer of $(2x^2 + 1)^3 + C$.


More generally, we can use the Chain Rule for antiderivatives whenever we have an integral of the form 


$$\int g(f(x))\,f'(x)\,dx.$$


We then make the substitutions u = f(x) and du = f’(x) dx to get a simpler integral of the form 


$$\int g(u)\,du.$$


We take the antiderivative of this integral to get a function G(u) + C, and finally we rewrite it in terms of x to get 
our final answer G(f(x)) + C. Note that reversing all of these steps gives us our usual Chain Rule for derivatives. 
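
A quick numerical check can confirm that a substitution was carried out correctly. The sketch below revisits Problem 5.18 on the interval [0, 1], which is an arbitrary choice made here for illustration; `simpson` is a hand-written composite Simpson's rule, not a library routine.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return 12 * x * (2 * x**2 + 1) ** 2

def G(x):
    """Antiderivative found with u = 2x^2 + 1, namely u^3."""
    return (2 * x**2 + 1) ** 3

numeric = simpson(integrand, 0.0, 1.0)
exact = G(1.0) - G(0.0)   # 27 - 1 = 26
print(numeric, exact)
```

If the substitution had dropped a constant factor, the two numbers would differ by exactly that factor, which makes this a handy way to catch the most common mistakes.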
Concept: The Chain Rule for antiderivatives, although a bit more cumbersome to use than
the Chain Rule for derivatives, is really nothing new! It’s just the Chain Rule for
derivatives in reverse.


The tricky part is recognizing when our function is of the form $g(f(x))\,f'(x)$. This can be hard in practice and is
part of the reason why computing antiderivatives is so much more difficult than computing derivatives. 


Let’s see the Chain Rule at work in a few problems. 


Problem 5.19: Compute $\int x\sqrt{3x^2 - 1}\,dx$.


Solution for Problem 5.19: Usually when trying to apply the Chain Rule to an antiderivative, we look for the “ugly”
expression in the integral. In this case, the ugly expression is $\sqrt{3x^2 - 1}$. So our guess is to try $u = 3x^2 - 1$. Note that
we don’t set u to be the entire ugly expression; rather, we set u to be something that’s inside our ugly expression.


Now we have to get lucky. We need to have du appear somewhere in the integral. In this problem, du = 6x dx,
and sure enough, there’s an x term sitting in there, so we multiply and divide by 6 to get
$$\int \frac{1}{6}(6x)\sqrt{3x^2 - 1}\,dx.$$
Then, after making the substitutions $u = 3x^2 - 1$ and du = 6x dx, our integral becomes
$$\int \frac{1}{6}\sqrt{u}\,du.$$


Don’t overlook the factor of $\frac{1}{6}$ that we had to throw in there. We have du = 6x dx, but we only had x dx in the
integral, so $x\,dx = \frac{1}{6}\,du$.


An alternative method of making the substitution is to start with our original integral
$$\int x\sqrt{3x^2 - 1}\,dx$$
and then make the substitution $u = 3x^2 - 1$, but then take our du = 6x dx expression and “solve for dx” to get
$dx = \frac{du}{6x}$. We then substitute into the integral to get
$$\int x\sqrt{u}\,\frac{du}{6x},$$
from which the x’s cancel and leave
$$\int \frac{1}{6}\sqrt{u}\,du.$$


This method looks a bit uglier, because of the intermediate step of $\int x\sqrt{u}\,\frac{du}{6x}$, which is a nasty mix of x’s and u’s.
But happily the x’s cancel, which we must have happen in order for this method to work.


Using either method, we are left with an indefinite integral in terms of u that we can evaluate:
$$\int \frac{1}{6}\sqrt{u}\,du = \frac{1}{6}\int u^{1/2}\,du = \frac{1}{6}\left(\frac{u^{3/2}}{3/2}\right) + C = \frac{1}{9}u^{3/2} + C.$$
To finish, we put everything back in terms of x:
$$\frac{1}{9}u^{3/2} + C = \frac{1}{9}(3x^2 - 1)^{3/2} + C.$$




As a check, you can take the derivative of this last expression and verify that it is equal to the integrand of the 
integral that we started with. □


The general strategy for using the Chain Rule is to look for an expression inside the 
integral to set equal to u. However, we must also be able to find the expression for 
du in the integral as well. After substituting for u, and replacing the corresponding 
dx expression with du, we must have an integral that is entirely in terms of u with 
no x terms remaining. 


WARNING!! The most common mistake is not correctly converting from dx to du. We
cannot just replace dx with du; we must solve for du in terms of dx, or vice
versa, and substitute accordingly.


Problem 5.20: Compute $\int \frac{x - 3}{2x^2 - 12x}\,dx$.


Solution for Problem 5.20: The denominator is the more complicated expression. So we'll try $u = 2x^2 - 12x$. This
gives du = (4x − 12) dx, so the numerator (x − 3) dx is du/4. Therefore, the integral is:
$$\int \frac{x - 3}{2x^2 - 12x}\,dx = \int \frac{1}{4u}\,du.$$


This antiderivative is
$$\frac{1}{4}\log|u| + C = \frac{1}{4}\log\left|2x^2 - 12x\right| + C.$$


The next problem involves exponentials: 


Problem 5.21:
(a) Compute $\int e^{3x}\,dx$.


(b) Compute $\int x e^{x^2}\,dx$.


(c) Compute $\int (e^x + 1)^3\,dx$.


Solution for Problem 5.21: 


(a) Usually with exponentials, we'll try setting u to be the quantity in the exponent, because we know how to
integrate $e^u$. So for part (a), we let u = 3x and du = 3 dx. Then the integral becomes
$$\int e^{3x}\,dx = \int \frac{1}{3}e^{u}\,du.$$


The antiderivative is just $\frac{1}{3}e^{u} + C = \frac{1}{3}e^{3x} + C$. We can immediately see that this is correct, since $\frac{d}{dx}\,\frac{1}{3}e^{3x} = e^{3x}$.


Setting u = cx for some constant c is a simple application of the Chain Rule. This 
will produce du = c dx, so the effect of substituting u for cx is to divide the integral 
by c. This application of the Chain Rule is so common that you will probably end 
up internalizing it and not need to write du out explicitly. 
(b) The $x^2$ in the exponent is hard to deal with, so we let $u = x^2$. This makes du = 2x dx, and we're in good shape
since we have an x term out front. We get
$$\int x e^{x^2}\,dx = \int \frac{1}{2}e^{u}\,du = \frac{1}{2}e^{u} + C = \frac{1}{2}e^{x^2} + C.$$
(c) This is more tricky. We might try $u = e^x + 1$, but then $du = e^x\,dx$, and we don’t have an “extra” $e^x$ term to
combine with dx and create du. So we can’t just apply the Chain Rule naively. However, our problems go
away simply by expanding the cubic:
$$\int (e^x + 1)^3\,dx = \int \left(e^{3x} + 3e^{2x} + 3e^{x} + 1\right) dx = \frac{1}{3}e^{3x} + \frac{3}{2}e^{2x} + 3e^{x} + x + C.$$


Concept: Don’t be afraid to do some algebra to make antiderivatives easier to evaluate. 


We have to be especially careful when we evaluate a definite integral using a substitution, as we’ll see in the 
next problem: 


Problem 5.22: Compute $\int_1^2 x e^{x^2}\,dx$.


Solution for Problem 5.22: To get the antiderivative, we use the substitution $u = x^2$, so du = 2x dx. But this is where
we have to be careful: it is tempting to substitute and simply keep the original limits 1 and 2 on the new integral.


This is not correct, because the new limits of integration need to be in terms of u, not x, since our integral is now
in terms of u. We have two methods by which to proceed.


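Method 1: Find the antiderivative in terms of x. We compute the antiderivative of $x e^{x^2}$ in terms of x in the usual
way:
$$\int x e^{x^2}\,dx = \frac{1}{2}\int e^{u}\,du = \frac{1}{2}e^{u} + C = \frac{1}{2}e^{x^2} + C.$$


We then use this antiderivative to compute the definite integral:
$$\int_1^2 x e^{x^2}\,dx = \frac{1}{2}e^{x^2}\,\Big|_1^2 = \frac{1}{2}\left(e^4 - e\right).$$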


Method 2: Change the limits of the definite integral to be in terms of u. We can change the interval on which we
are computing the definite integral, from an interval on x to an interval on u. We see that when x = 1, we have
$u = x^2 = 1$, and when x = 2, we have $u = x^2 = 4$. Thus,
$$\int_1^2 x e^{x^2}\,dx = \int_1^4 \frac{1}{2}e^{u}\,du.$$


We now finish the computation doing everything in terms of u. This gives
$$\int_1^4 \frac{1}{2}e^{u}\,du = \frac{1}{2}e^{u}\,\Big|_1^4 = \frac{1}{2}\left(e^4 - e\right). \quad \square$$
The second method in the above solution is usually easier, since we can then skip the step of converting the
antiderivative back in terms of the original variable. 
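
The two methods can be compared numerically. The sketch below, which uses a hand-written Simpson's rule (the name `simpson` is just a helper for this illustration), integrates the original function over the x-interval [1, 2] and the substituted function over the u-interval [1, 4], and compares both with the closed form.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Original integral in x, with the original limits 1 and 2.
in_x = simpson(lambda x: x * math.exp(x * x), 1.0, 2.0)

# Substituted integral in u = x^2, with the converted limits 1 and 4.
in_u = simpson(lambda u: 0.5 * math.exp(u), 1.0, 4.0)

exact = 0.5 * (math.exp(4) - math.e)
print(in_x, in_u, exact)
```

Keeping the limits 1 and 2 on the u-integral instead would give a visibly different number, which is exactly the error the solution warns about.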


The Chain Rule is also useful with trigonometric antiderivatives: 


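Problem 5.23:
(a) Compute $\int \cos 3\theta\,d\theta$.

(b) Compute $\int \sin^3\theta\cos\theta\,d\theta$.

(c) Compute $\int \sin^2\theta\cos^2\theta\,d\theta$.


Solution for Problem 5.23:
(a) Once again, the simple substitution $u = c\theta$ is the key. In this case, we substitute $u = 3\theta$ and $du = 3\,d\theta$, and
we get:
$$\int \cos 3\theta\,d\theta = \int \frac{1}{3}\cos u\,du = \frac{1}{3}\sin u + C = \frac{1}{3}\sin 3\theta + C.$$
We can quickly check that $\frac{d}{d\theta}\,\frac{1}{3}\sin 3\theta = \cos 3\theta$, as required.


(b) We don’t know how to integrate $\sin^3\theta$. But we do know how to integrate $u^3$. So let’s try $u = \sin\theta$. Then
$du = \cos\theta\,d\theta$, and we have:
$$\int \sin^3\theta\cos\theta\,d\theta = \int u^3\,du = \frac{1}{4}u^4 + C = \frac{1}{4}\sin^4\theta + C.$$


(c) Why doesn’t the substitution $u = \sin\theta$ work here, as it did in part (b)?
The problem is that we have $\cos^2\theta$ in our original integral, but du only has a single factor of $\cos\theta$.
So instead we'll try some algebraic manipulation. We can write
$$\sin^2\theta\cos^2\theta = (\sin\theta\cos\theta)^2 = \left(\frac{1}{2}\sin 2\theta\right)^2 = \frac{1}{4}\sin^2 2\theta.$$
Then, we have
$$\int \sin^2\theta\cos^2\theta\,d\theta = \frac{1}{4}\int \sin^2 2\theta\,d\theta.$$
That’s still not good enough, because we don’t know how to integrate $\sin^2$. We have to do more trig
manipulation to reduce the number of powers of trig functions, by noting that $\cos 2x = 1 - 2\sin^2 x$. So
$\sin^2 x = \frac{1}{2}(1 - \cos 2x)$, and thus we have
$$\frac{1}{4}\int \sin^2 2\theta\,d\theta = \frac{1}{8}\int (1 - \cos 4\theta)\,d\theta.$$
Now we can integrate:
$$\frac{1}{8}\int (1 - \cos 4\theta)\,d\theta = \frac{1}{8}\left(\theta - \frac{1}{4}\sin 4\theta\right) + C = \frac{1}{8}\theta - \frac{1}{32}\sin 4\theta + C.$$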
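
Since the answer to part (c) took several trig identities to reach, it is worth checking by differentiating it. The sketch below does this numerically with a finite difference; the function names and sample points are illustrative choices, not part of the text.

```python
import math

def integrand(theta):
    """The original integrand sin^2(theta) * cos^2(theta)."""
    return math.sin(theta) ** 2 * math.cos(theta) ** 2

def antiderivative(theta):
    """The answer found above (with C = 0): theta/8 - sin(4*theta)/32."""
    return theta / 8 - math.sin(4 * theta) / 32

def central_difference(f, x, h=1e-5):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for theta in (0.0, 0.3, 1.0, 2.5):
    print(theta, central_difference(antiderivative, theta), integrand(theta))
```

The slope of the antiderivative should match the integrand at every sample point, up to the small error of the finite-difference estimate.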


5.3.4 Integration by parts 


We recall the Product Rule for derivatives: 


$$(fg)' = fg' + f'g.$$
This may lead us to wonder if there is a similar product rule for antiderivatives. The answer is “sort of,” as we 
can see in the following example: 


Problem 5.24: Compute $\int x e^x\,dx$.


Solution for Problem 5.24: We can’t do much with this using the techniques that we already have. Letting u = x or
$u = e^x$ doesn’t help at all. But noticing that we have a product gives us an idea.


We know that
$$(xe^x)' = x(e^x)' + (x)'e^x = xe^x + e^x.$$


Rearranging, we have
$$xe^x = (xe^x)' - e^x.$$


Now we can take the antiderivative of both sides:
$$\int xe^x\,dx = \int \left((xe^x)' - e^x\right) dx = \int (xe^x)'\,dx - \int e^x\,dx = xe^x - e^x + C.$$


In the last step, notice that we used the “trivial” antidifferentiation rule $\int f' = f + C$. □


“Undoing the Product Rule” as in Problem 5.24 is the general idea of the technique called integration by parts. 
Suppose that we want to find the antiderivative of something of the form uv’, where u and v are functions. We 
can write out the product rule for the derivative of uv: 


$$(uv)' = uv' + u'v.$$


Solving for uv’ gives
$$uv' = (uv)' - u'v.$$


So the antiderivative of uv’ is just the antiderivative of (uv)’ − u’v. That is,
$$\int uv'\,dx = \int \left((uv)' - u'v\right) dx = uv - \int u'v\,dx.$$


Typically, we write the above equation using our usual notational shortcuts du = u’ dx and dv = v’ dx, giving: 
Important: Integration By Parts
$$\int u\,dv = uv - \int v\,du$$


Let's revisit Problem 5.24 using this new notation. We set u = x and $dv = e^x\,dx$. Then du = dx (the derivative of
u) and $v = e^x$ (the antiderivative of dv). Thus, our integration by parts formula is
$$\int xe^x\,dx = xe^x - \int e^x\,dx.$$


The advantage of this is that we know how to compute the antiderivative on the far right: it’s just $e^x + C$. Therefore,
$$\int xe^x\,dx = xe^x - e^x + C = (x - 1)e^x + C.$$


This is the general strategy for applying integration by parts: if we have an integral that is difficult to evaluate, 
but which has a part (denoted by dv) that is easier to integrate, then we can try applying integration by parts. 
This is especially useful if the other part of the integral—the u part—has a simple derivative. Keep this general 
strategy in the back of your mind as we work through the following examples. 


Problem 5.25: 
(a) Compute $\int x\sin x\,dx$.


(b) Compute $\int x^2 e^x\,dx$.


Solution for Problem 5.25: 


(a) Often, a tricky step in using integration by parts is deciding which part should be u and which part should 
be dv. If we make the wrong choice, we can make things worse: 


Bogus Solution: Let u = sin x and dv = x dx. Then du = cos x dx and $v = \frac{1}{2}x^2$, and integration
by parts gives us:
$$\int x\sin x\,dx = \frac{1}{2}x^2\sin x - \int \frac{1}{2}x^2\cos x\,dx.$$


Although our calculation above is correct, we see that we’ve made our integral more complicated, because 
now we have to integrate a quadratic term times a trigonometric function. That’s because we picked the 
parts incorrectly. 


Try to pick u and dv so that du and v are simpler, if possible. This usually means 


we don’t want to pick polynomial terms to be in dv, since their antiderivatives are 
more complicated. 


In our problem, we should set u = x and dv = sin x dx. Then du = dx and v = −cos x, and our integration by
parts formula gives us:
$$\int x\sin x\,dx = -x\cos x - \int (-\cos x)\,dx = -x\cos x + \sin x + C.$$


(b) We set $u = x^2$ (so that du = 2x dx) and $dv = e^x\,dx$ (so that $v = e^x$). Then integration by parts gives us:
$$\int x^2 e^x\,dx = x^2 e^x - \int 2x e^x\,dx.$$
We know how to do the latter integral: it’s integration by parts again! We let u = 2x and $dv = e^x\,dx$, so that
du = 2 dx and $v = e^x$. Thus, the entire computation is
$$\begin{aligned}
\int x^2 e^x\,dx &= x^2 e^x - \int 2x e^x\,dx \\
&= x^2 e^x - \left(2x e^x - \int 2e^x\,dx\right) \\
&= x^2 e^x - 2x e^x + 2e^x + C \\
&= (x^2 - 2x + 2)e^x + C.
\end{aligned}$$


□


Let’s again revisit Problem 5.24, but this time let’s make it a definite integral and see how that works: 


Problem 5.26: Compute $\int_0^2 x e^x\,dx$.


Solution for Problem 5.26: As before, we apply integration by parts by letting u = x (so that du = dx) and $dv = e^x\,dx$
(so that $v = e^x$). Then integration by parts gives
$$\int_0^2 x e^x\,dx = x e^x\,\Big|_0^2 - \int_0^2 e^x\,dx.$$


Notice that the uv term must now be evaluated at the limits of integration. We continue:
$$\begin{aligned}
\int_0^2 x e^x\,dx &= x e^x\,\Big|_0^2 - \int_0^2 e^x\,dx \\
&= (2e^2 - 0e^0) - e^x\,\Big|_0^2 \\
&= 2e^2 - (e^2 - e^0) \\
&= e^2 + 1.
\end{aligned}$$


□


Problem 5.27: Compute $\int e^x\sin x\,dx$.


Solution for Problem 5.27: It’s not clear which part we should set equal to u and which part to dv: they both have
“easy” derivatives and antiderivatives. So let’s just try one and see what happens.


If we set $u = e^x$ (so that $du = e^x\,dx$) and dv = sin x dx (so that v = −cos x), then we have
$$\int e^x\sin x\,dx = -e^x\cos x + \int e^x\cos x\,dx.$$


This doesn’t seem all that helpful, since we still have an integral that we don’t know how to compute. But it’s very
similar to the integral that we started with. So let’s try integration by parts again. Let $u = e^x$ and dv = cos x dx;
then $du = e^x\,dx$ and v = sin x, so we have
$$\begin{aligned}
\int e^x\sin x\,dx &= -e^x\cos x + \int e^x\cos x\,dx \\
&= -e^x\cos x + \left(e^x\sin x - \int e^x\sin x\,dx\right) \\
&= e^x(\sin x - \cos x) - \int e^x\sin x\,dx.
\end{aligned}$$
We seem to be back where we started, with $\int e^x\sin x\,dx$ on the right side. Are we stuck?
No! If we move the integral on the right side over to the left side, we have
$$2\int e^x\sin x\,dx = e^x(\sin x - \cos x),$$
and hence, dividing by 2, we have our answer:
$$\int e^x\sin x\,dx = \frac{e^x(\sin x - \cos x)}{2} + C.$$


(Note the C that we add at the end.) □
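
Because the answer came from an unusual "solve for the integral" step, a numerical check is reassuring. The sketch below compares a Simpson's rule estimate on [0, π] with the difference of the antiderivative at the endpoints; the interval and the helper names are assumptions made for this illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def F(x):
    """Antiderivative found by integrating by parts twice (with C = 0)."""
    return math.exp(x) * (math.sin(x) - math.cos(x)) / 2

numeric = simpson(lambda x: math.exp(x) * math.sin(x), 0.0, math.pi)
exact = F(math.pi) - F(0.0)   # equals (e^pi + 1) / 2
print(numeric, exact)
```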


Integration by parts also lets us compute the antiderivative of a “simple” function: 


Problem 5.28: Compute $\int \log x\,dx$.


Solution for Problem 5.28: This integral seems a little hard to grasp. What are the “parts”? There seems to just be
one term: log x. So what can we set to be u and dv? We don’t seem to have much choice for u: since we don’t
know how to integrate log x, we have little choice but to set u = log x, so that $du = \frac{1}{x}\,dx$. What does that leave for
dv? All that’s left in the integral is dx, so we set dv = dx. This makes v = x, and our integration by parts calculation
gives us:
$$\begin{aligned}
\int \log x\,dx &= x\log x - \int x\left(\frac{1}{x}\right) dx \\
&= x\log x - \int 1\,dx \\
&= x\log x - x + C.
\end{aligned}$$
As a check, note that
$$\frac{d}{dx}(x\log x - x) = x\left(\frac{1}{x}\right) + 1(\log x) - 1 = 1 + \log x - 1 = \log x.$$


Integration by parts is nothing more than a rearranged version of the Product 
Rule for derivatives. If you can remember 


$$(uv)' = uv' + u'v,$$


you can just rearrange it to
$$uv' = (uv)' - u'v$$


and then integrate this formula to get
$$\int u\,dv = uv - \int v\,du.$$
5.3.5 Substitution methods


The Chain Rule is one example of a substitution method for evaluating antiderivatives. There are other more 
sophisticated sorts of substitutions that we can make that let us evaluate more difficult integrals. All of the 
substitutions follow the same basic rule: 


Important: If we substitute for x in $\int f(x)\,dx$, then we have to substitute for dx too.


The methods that we will discuss in this subsection are essentially “backwards” versions of the Chain Rule
substitutions that we studied earlier. Before, we would take an integral of the form
$$\int g(f(x))\,f'(x)\,dx$$
and make the substitutions u = f(x) and du = f’(x) dx to get an integral of the form
$$\int g(u)\,du.$$


However, now we are going to consider substitutions in the other direction. Specifically, we’ll start with an
integral of the form
$$\int g(x)\,dx$$
and make a substitution of the form x = f(u). This gives dx = f’(u) du, so after substitution, our integral becomes
$$\int g(f(u))\,f'(u)\,du.$$


This seems strange—it looks like we’ve made our integral a lot more complicated! But for the right type of 
function, this sort of substitution actually makes things work out nicely. 


Let’s see some examples. Our first example is a substitution that might seem quite surprising! 


Problem 5.29: Compute $\int \frac{1}{\sqrt{1 - x^2}}\,dx$.


Solution for Problem 5.29: We don’t have any good way to deal with that ugly denominator. But the expression
$\sqrt{1 - x^2}$ may remind us of the unit circle, and the unit circle should remind us of trig functions. So we try the
substitution $x = \sin\theta$. We note that this substitution makes sense, because the function that we’re integrating is
only defined for −1 < x < 1, and $\sin\theta$ contains the interval (−1, 1) in its range.


This means that $dx = \cos\theta\,d\theta$, so we have
$$\int \frac{1}{\sqrt{1 - x^2}}\,dx = \int \frac{\cos\theta}{\sqrt{1 - \sin^2\theta}}\,d\theta.$$


Note that the $\cos\theta$ term in the numerator comes from the conversion of dx to $d\theta$.


Now the denominator simplifies nicely! We have
$$\sqrt{1 - \sin^2\theta} = \sqrt{\cos^2\theta} = |\cos\theta|,$$
so our integral becomes
$$\int \frac{\cos\theta}{|\cos\theta|}\,d\theta.$$


This looks very nice except for that annoying absolute value sign. Perhaps we can make it go away....


We can! Notice that in the substitution $x = \sin\theta$, we can restrict to $\theta \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ and we get $x \in (-1, 1)$. But on
the interval $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, the cosine function is strictly positive. So we can replace $|\cos\theta|$ with just $\cos\theta$. Hence, our
integral is simply
$$\int \frac{\cos\theta}{\cos\theta}\,d\theta = \int 1\,d\theta = \theta + C.$$


But we don’t want the final answer in terms of $\theta$, so we make the reverse substitution $\theta = \arcsin x$, to get the final
answer of
$$\int \frac{1}{\sqrt{1 - x^2}}\,dx = \arcsin x + C.$$


Again, note that the domain of arcsin is (−1, 1) and its range is $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, so all of our substitutions are consistent. □


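But wait a minute. What if we had instead made the substitution $x = \cos\theta$ over the interval $\theta \in (0, \pi)$? Then
we'd have $dx = -\sin\theta\,d\theta$, and the integral becomes:
$$\begin{aligned}
\int \frac{1}{\sqrt{1 - x^2}}\,dx &= \int \frac{-\sin\theta}{\sqrt{1 - \cos^2\theta}}\,d\theta \\
&= \int \frac{-\sin\theta}{\sin\theta}\,d\theta \\
&= -\theta + C = -\arccos x + C.
\end{aligned}$$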


How can this be? Is the integral equal to both $\arcsin x$ and $-\arccos x$? It can't possibly be, since any two antiderivatives must differ by a constant, right?

There's no problem, because $\arcsin x$ and $-\arccos x$ do differ by a constant! Recall that $\sin\left(\frac{\pi}{2} - \theta\right) = \cos\theta$. This implies that $\arcsin x = \frac{\pi}{2} - \arccos x$ for any $-1 < x < 1$. (Note that there are also some issues with the ranges of arcsin and arccos, but these also amount to merely adding or subtracting a constant.)
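The claim that the two answers differ by a constant is easy to test numerically. The following sketch (the sample grid is an arbitrary choice) evaluates $\arcsin x - (-\arccos x)$ at several points of $(-1,1)$ and measures how far it strays from $\frac{\pi}{2}$.

```python
import math

# Sample points strictly inside (-1, 1); the grid itself is an arbitrary choice.
xs = [k / 10 for k in range(-9, 10)]

# arcsin x - (-arccos x) = arcsin x + arccos x should be the same constant everywhere.
differences = [math.asin(x) + math.acos(x) for x in xs]
max_deviation = max(abs(d - math.pi / 2) for d in differences)
print(max_deviation)
```

The printed deviation should be at the level of floating-point rounding.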


In general, we can use substitutions (trigonometric or otherwise) to make “ugly” terms in integrals come out nicer. A common use is as in the last problem: when we have a term of the form $1 - x^2$, especially $\sqrt{1-x^2}$, we use $x = \sin\theta$ or $x = \cos\theta$ so that we can apply the trigonometric identity $\sin^2\theta + \cos^2\theta = 1$.


We'll reiterate a subtle point about this substitution: we can only make the substitution $x = \sin\theta$ if $x$ only takes on values between $-1$ and $1$, because those are the only values in the range of $\sin\theta$. But when we make a substitution for $x$ in $\sqrt{1-x^2}$, we must have $1 - x^2 \ge 0$ (in order to take the square root), so we must have $x^2 \le 1$, giving $-1 \le x \le 1$, which makes the trig substitution valid.


The next problem is similar and also quite important: 


Solution for Problem 5.30: This is very similar to Problem 5.29, except we have $1 + x^2$ in the denominator instead of $\sqrt{1-x^2}$. One thing to note is that the domain of $x$ is all of $\mathbb{R}$, so we know that we can't make the substitution $x = \sin\theta$ (since that only takes values between $-1$ and $1$). Fortunately, we do know a trig function that has a range of all the reals: tangent.


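We let $x = \tan\theta$. Then $dx = \sec^2\theta\,d\theta$, and we have
\[ \int \frac{1}{1+x^2}\,dx = \int \frac{\sec^2\theta}{1+\tan^2\theta}\,d\theta. \]

This may look ugly, but actually we're in great shape. We use the identity $1 + \tan^2\theta = \sec^2\theta$, so our integral becomes
\[ \int \frac{\sec^2\theta}{1+\tan^2\theta}\,d\theta = \int 1\,d\theta = \theta + C. \]

As always, we want to put this back in terms of $x = \tan\theta$. This means that $\theta = \arctan x$, and we have
\[ \int \frac{1}{1+x^2}\,dx = \arctan x + C. \]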


Finally, we compute the definite integral:
\[ \int_0^1 \frac{1}{1+x^2}\,dx = \arctan x\,\Big|_0^1 = \tan^{-1} 1 - \tan^{-1} 0 = \frac{\pi}{4}. \]

Note that we could have also written the entire calculation, including the substitution, in terms of a definite integral, as follows:
\[ \int_0^1 \frac{1}{1+x^2}\,dx = \int_0^{\pi/4} \frac{\sec^2\theta}{1+\tan^2\theta}\,d\theta = \int_0^{\pi/4} d\theta = \frac{\pi}{4}. \]
□
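Both forms of the definite integral above can be approximated numerically. This sketch (the `midpoint` helper is an illustrative assumption, not something from the text) integrates once in $x$ over $[0,1]$ and once in $\theta$ over $[0,\pi/4]$, and both should land near $\pi/4$.

```python
import math

def midpoint(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule (n slices)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# The original integral in x.
in_x = midpoint(lambda x: 1.0 / (1.0 + x * x), 0.0, 1.0)

# After x = tan(theta): the integrand sec^2 / (1 + tan^2) should simplify to 1.
in_theta = midpoint(
    lambda t: (1.0 / math.cos(t) ** 2) / (1.0 + math.tan(t) ** 2),
    0.0,
    math.pi / 4,
)
print(in_x, in_theta, math.pi / 4)
```

Note that the limits change from $[0,1]$ to $[0,\pi/4]$ because $\tan 0 = 0$ and $\tan\frac{\pi}{4} = 1$.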


The integrals in the previous two problems are fairly common; you should learn them. 


Here’s a more complicated example of this: 


Solution for Problem 5.31: We have a $\sqrt{1-x^2}$ term, so we think to use the trig substitution $x = \sin\theta$. This gives $dx = \cos\theta\,d\theta$, so the integral becomes
\[ \int x^3\sqrt{1-x^2}\,dx = \int \sin^3\theta\,\sqrt{1-\sin^2\theta}\,\cos\theta\,d\theta = \int \sin^3\theta\cos^2\theta\,d\theta. \]
There are a number of ways to continue from here. One method is to use the identity $\sin^2\theta = 1 - \cos^2\theta$ on two of the sine factors, giving
\[ \int \sin^3\theta\cos^2\theta\,d\theta = \int (\cos^2\theta - \cos^4\theta)\sin\theta\,d\theta. \]

Now we can make the substitution $u = \cos\theta$ and $du = -\sin\theta\,d\theta$, giving us
\[ \int (\cos^2\theta - \cos^4\theta)\sin\theta\,d\theta = \int -(u^2 - u^4)\,du = \int (u^4 - u^2)\,du = \frac{1}{5}u^5 - \frac{1}{3}u^3 + C. \]
Finally, we use the fact that $u = \cos\theta = \sqrt{1-\sin^2\theta} = \sqrt{1-x^2}$, giving the solution:
\[ \int x^3\sqrt{1-x^2}\,dx = \frac{1}{5}(1-x^2)^{5/2} - \frac{1}{3}(1-x^2)^{3/2} + C. \]


Alternatively, noting that we had $u = \sqrt{1-x^2}$ at the end of our calculation above, we could have made this substitution to begin with. Then $u^2 = 1 - x^2$, so $2u\,du = -2x\,dx$, hence $-u\,du = x\,dx$, and we have
\[ \int x^3\sqrt{1-x^2}\,dx = \int (1-u^2)(u)(-u)\,du = \int (u^4 - u^2)\,du = \frac{1}{5}u^5 - \frac{1}{3}u^3 + C = \frac{1}{5}(1-x^2)^{5/2} - \frac{1}{3}(1-x^2)^{3/2} + C. \]
□
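One way to gain confidence in an antiderivative like this is to differentiate it numerically and compare with the integrand. The sketch below does so with a central difference; the function names, step size, and sample points are illustrative assumptions.

```python
import math

def F(x):
    """Candidate antiderivative: (1/5)(1 - x^2)^(5/2) - (1/3)(1 - x^2)^(3/2)."""
    u = math.sqrt(1.0 - x * x)
    return u ** 5 / 5.0 - u ** 3 / 3.0

def f(x):
    """Integrand x^3 * sqrt(1 - x^2)."""
    return x ** 3 * math.sqrt(1.0 - x * x)

def central_difference(g, x, h=1e-5):
    """Symmetric finite-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

xs = [-0.9, -0.5, 0.0, 0.3, 0.8]  # sample points inside (-1, 1)
max_error = max(abs(central_difference(F, x) - f(x)) for x in xs)
print(max_error)
```

A small printed error indicates that $F' = f$ at the sampled points, which is what an antiderivative must satisfy.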


Although trig substitution is by far the most commonly used substitution method, some functions can be 
integrated with an unusual substitution, such as the following example: 


Problem 5.32: Compute $\int \frac{e^{2x}}{\sqrt{1+e^x}}\,dx$.


Solution for Problem 5.32: There's no obvious way to attack this integral. Substituting $u = e^x$ might work. This gives $du = e^x\,dx$, and we have
\[ \int \frac{e^{2x}}{\sqrt{1+e^x}}\,dx = \int \frac{u}{\sqrt{1+u}}\,du. \]
Now we can evaluate this by parts. We let $u = u$ (convenient!) and $dv = (1+u)^{-1/2}\,du$, so that $v = 2(1+u)^{1/2}$. This gives us
\[ \int \frac{u}{\sqrt{1+u}}\,du = 2u\sqrt{1+u} - 2\int \sqrt{1+u}\,du = 2u\sqrt{1+u} - \frac{4}{3}(1+u)^{3/2} + C. \]

To finish, we replace $u = e^x$ to get
\[ \int \frac{e^{2x}}{\sqrt{1+e^x}}\,dx = 2e^x(1+e^x)^{1/2} - \frac{4}{3}(1+e^x)^{3/2} + C. \]


Alternatively, we could use a more creative substitution on the original integral. Let's get rid of the whole messy denominator at once, by making the substitution $u = \sqrt{1+e^x}$. Now we need to convert $e^{2x}\,dx$ into something in terms of $u$. To isolate $e^x$, we “solve” our substitution:
\[ u = \sqrt{1+e^x} \quad\Rightarrow\quad e^x = u^2 - 1. \]

Thus, the numerator of our integral will become $(u^2-1)^2$.

We also need to convert $dx$ into something involving $du$. We continue solving for $x$, so that we can compute $dx$:
\[ e^x = u^2 - 1 \quad\Rightarrow\quad x = \log(u^2-1) \quad\Rightarrow\quad dx = \frac{2u}{u^2-1}\,du. \]


Now we can write our integral in terms of $u$:
\[ \int \frac{e^{2x}}{\sqrt{1+e^x}}\,dx = \int \frac{(u^2-1)^2}{u}\cdot\frac{2u}{u^2-1}\,du. \]

How nice! A lot of things cancel, and we're left with just
\[ \int 2(u^2-1)\,du = \frac{2}{3}u^3 - 2u + C, \]
and converting this back into $x$ gives us our answer:
\[ \int \frac{e^{2x}}{\sqrt{1+e^x}}\,dx = \frac{2}{3}(1+e^x)^{3/2} - 2(1+e^x)^{1/2} + C. \]

The answers that we got from our two different methods may look different, but if you check them carefully (and rearrange some terms), you will see that they are the same. □
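The claim that the two answers agree can be checked without any algebra. The following sketch evaluates both expressions at a few sample points and also compares a numerical derivative with the integrand; all function names and sample values are illustrative assumptions.

```python
import math

def by_parts(x):
    """Answer from u = e^x followed by integration by parts."""
    e = math.exp(x)
    return 2.0 * e * math.sqrt(1.0 + e) - (4.0 / 3.0) * (1.0 + e) ** 1.5

def by_root_substitution(x):
    """Answer from the substitution u = sqrt(1 + e^x)."""
    e = math.exp(x)
    return (2.0 / 3.0) * (1.0 + e) ** 1.5 - 2.0 * math.sqrt(1.0 + e)

def integrand(x):
    return math.exp(2.0 * x) / math.sqrt(1.0 + math.exp(x))

def central_difference(g, x, h=1e-5):
    """Symmetric finite-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

xs = [-2.0, -0.5, 0.0, 1.0, 2.0]
max_gap = max(abs(by_parts(x) - by_root_substitution(x)) for x in xs)
max_slope_error = max(abs(central_difference(by_parts, x) - integrand(x)) for x in xs)
print(max_gap, max_slope_error)
```

Writing $w = \sqrt{1+e^x}$, the first answer is $2(w^2-1)w - \frac{4}{3}w^3 = \frac{2}{3}w^3 - 2w$, which is the second answer, so here the two expressions agree exactly and not merely up to a constant.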


5.3.6 Partial fractions 


The technique of partial fractions is designed for integrals such as the following: 


Solution for Problem 5.33: We can see that a trig substitution won't help. In particular, since $x$ can be any real number except for $\pm 1$, the only possibilities are $x = \tan\theta$ or $x = \cot\theta$, but the quantities $\tan^2\theta - 1$ and $\cot^2\theta - 1$ are not particularly nice.

However, we notice that we can factor $x^2 - 1 = (x-1)(x+1)$. Surely that must help. Indeed, that factorization lets us rewrite the function in terms of simpler fractions:
\[ \frac{1}{x^2-1} = \frac{1}{2}\left(\frac{1}{x-1} - \frac{1}{x+1}\right). \]
This is very helpful, because those simpler fractions have easy antiderivatives. This lets us easily finish the problem:
\[ \begin{aligned}
\int \frac{1}{x^2-1}\,dx &= \frac{1}{2}\left(\int \frac{1}{x-1}\,dx - \int \frac{1}{x+1}\,dx\right) \\
&= \frac{1}{2}\left(\log|x-1| - \log|x+1|\right) + C \\
&= \frac{1}{2}\log\left|\frac{x-1}{x+1}\right| + C.
\end{aligned} \]
□


Problem 5.33 is a basic example of the technique of partial fractions. The general idea is that we break up a complicated fraction into simpler pieces that are easy to integrate. Here is a slightly more complicated example:

Problem 5.34: Compute $\int \frac{x+2}{x^2-2x-3}\,dx$.


Solution for Problem 5.34: If we were lucky, we could try the Chain Rule substitution $u = x^2 - 2x - 3$. But then $du = (2x-2)\,dx$ doesn't match the term in the numerator, so we're stuck.


However, we notice that $x^2 - 2x - 3$ factors as $(x-3)(x+1)$, so we can try partial fractions. We hope to write
\[ \frac{x+2}{x^2-2x-3} = \frac{A}{x-3} + \frac{B}{x+1}, \]
where $A$ and $B$ are suitable constants.

It is an easy algebra exercise to solve for $A$ and $B$. One method is to place the right side of the above expression over a common denominator:
\[ \frac{x+2}{x^2-2x-3} = \frac{A}{x-3} + \frac{B}{x+1} = \frac{A(x+1) + B(x-3)}{x^2-2x-3}, \]
so that we must have
\[ x + 2 = A(x+1) + B(x-3). \]
We can then match coefficients to get the system of equations $A + B = 1$ and $A - 3B = 2$, which gives $A = \frac{5}{4}$ and $B = -\frac{1}{4}$. Alternatively, we can plug in $x = -1$ to the above equation to get $1 = B(-4)$, hence $B = -\frac{1}{4}$, and we can plug in $x = 3$ to get $5 = A(4)$, so $A = \frac{5}{4}$.


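Thus, our integral becomes
\[ \int \frac{x+2}{x^2-2x-3}\,dx = \frac{5}{4}\int \frac{1}{x-3}\,dx - \frac{1}{4}\int \frac{1}{x+1}\,dx. \]

Both of the integrals on the right side of the above equation give us logarithm terms in our final answer:
\[ \int \frac{x+2}{x^2-2x-3}\,dx = \frac{5}{4}\log|x-3| - \frac{1}{4}\log|x+1| + C. \]

Some people might prefer to combine the logarithm terms into one expression to get a more compact final answer:
\[ \int \frac{x+2}{x^2-2x-3}\,dx = \frac{1}{4}\log\left|\frac{(x-3)^5}{x+1}\right| + C. \]
□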
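The “plug in the roots” method for finding $A$ and $B$ can be carried out with exact rational arithmetic. The sketch below uses Python's `fractions` module; the helper names are illustrative, and the approach as written assumes a denominator with two distinct linear factors.

```python
from fractions import Fraction

def numerator(x):
    return x + 2

# Roots of x^2 - 2x - 3 = (x - 3)(x + 1).
r1, r2 = Fraction(3), Fraction(-1)

# Plugging x = r1 into x + 2 = A(x + 1) + B(x - 3) kills the B term, and vice versa.
A = numerator(r1) / (r1 - r2)
B = numerator(r2) / (r2 - r1)

def decomposition(x):
    return A / (x - r1) + B / (x - r2)

def original(x):
    return numerator(x) / (x * x - 2 * x - 3)

# Compare the two forms exactly at a few rational sample points (avoiding the roots).
samples = [Fraction(0), Fraction(1, 2), Fraction(5), Fraction(-7, 3)]
agree = all(decomposition(x) == original(x) for x in samples)
print(A, B, agree)
```

Because the arithmetic is exact, agreement at the sample points is a genuine equality check rather than a floating-point approximation.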


5.3.7 A monster example 


We'll finish this long section on antiderivative methods with an example that combines a few different techniques. 


Problem 5.35: Compute $\int \frac{3}{x^3-1}\,dx$.


Solution for Problem 5.35: There appears to be no clever substitution that helps us. But we have a factorization of the denominator as $x^3 - 1 = (x-1)(x^2+x+1)$, so we can try for a partial fraction decomposition:
\[ \frac{3}{x^3-1} = \frac{?}{x-1} + \frac{?}{x^2+x+1}. \]


We can try for constant terms in the numerators, but that runs into problems. Instead, we will need a linear term in the numerator over $x^2 + x + 1$; otherwise we have no hope of canceling the quadratic factor that will result when we place the right side over a common denominator. Specifically, we will have
\[ \frac{3}{x^3-1} = \frac{A}{x-1} + \frac{Bx+C}{x^2+x+1}, \]
where $A$, $B$, $C$ are constants that we need to find. Placing the right side over a common denominator gives
\[ \frac{3}{x^3-1} = \frac{A(x^2+x+1) + (Bx+C)(x-1)}{x^3-1} = \frac{(A+B)x^2 + (A-B+C)x + (A-C)}{x^3-1}. \]
Matching the corresponding coefficients of the numerators gives us the system of linear equations
\[ \begin{aligned}
A + B &= 0, \\
A - B + C &= 0, \\
A - C &= 3.
\end{aligned} \]
Solving this yields $A = 1$, $B = -1$, and $C = -2$.


Thus, our partial fraction decomposition is
\[ \int \frac{3}{x^3-1}\,dx = \int \left(\frac{1}{x-1} - \frac{x+2}{x^2+x+1}\right)dx. \]
The first term of the integral on the right side above is straightforward:
\[ \int \frac{1}{x-1}\,dx = \log|x-1| + C. \]


But how do we compute $\int \frac{x+2}{x^2+x+1}\,dx$? The denominator doesn't factor any further, and trying the Chain Rule with $u = x^2 + x + 1$ gives $du = (2x+1)\,dx$, which doesn't match the numerator. But we can force this substitution to match at least partially, by writing
\[ \frac{x+2}{x^2+x+1} = \frac{1}{2}\cdot\frac{2x+1}{x^2+x+1} + \frac{3}{2}\cdot\frac{1}{x^2+x+1}. \]

Now the Chain Rule does work on the first summand:
\[ \int \frac{1}{2}\cdot\frac{2x+1}{x^2+x+1}\,dx = \int \frac{1}{2}\,\frac{du}{u} = \frac{1}{2}\log|u| + C = \frac{1}{2}\log|x^2+x+1| + C. \]
The term that remains is
\[ \frac{3}{2}\int \frac{1}{x^2+x+1}\,dx. \]

This looks a little bit like an arctangent, but there's that annoying linear term in there. We can make it look more like the derivative of an arctangent if we complete the square of the denominator:
\[ \frac{1}{x^2+x+1} = \frac{1}{\left(x+\frac{1}{2}\right)^2 + \frac{3}{4}}. \]


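We evaluate the antiderivative of this via a trig substitution. We want the denominator of the above fraction to look like a constant times $\tan^2\theta + 1$, so we let $x = \frac{\sqrt{3}}{2}\tan\theta - \frac{1}{2}$. Noting that $dx = \frac{\sqrt{3}}{2}\sec^2\theta\,d\theta$, we get
\[ \int \frac{1}{x^2+x+1}\,dx = \int \frac{1}{\left(x+\frac{1}{2}\right)^2 + \frac{3}{4}}\,dx = \int \frac{\frac{\sqrt{3}}{2}\sec^2\theta}{\frac{3}{4}(\tan^2\theta + 1)}\,d\theta. \]

But $\tan^2\theta + 1 = \sec^2\theta$, so this is just
\[ \int \frac{2}{\sqrt{3}}\,d\theta = \frac{2}{\sqrt{3}}\,\theta + C. \]

Undoing the substitution, we see that $\theta = \tan^{-1}\left(\frac{2}{\sqrt{3}}\left(x+\frac{1}{2}\right)\right)$, so we get
\[ \int \frac{1}{x^2+x+1}\,dx = \frac{2}{\sqrt{3}}\tan^{-1}\left(\frac{2}{\sqrt{3}}\left(x+\frac{1}{2}\right)\right) + C. \]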


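Putting all of these steps together, we get our answer:
\[ \begin{aligned}
\int \frac{3}{x^3-1}\,dx &= \int \frac{1}{x-1}\,dx - \int \frac{x+2}{x^2+x+1}\,dx \\
&= \int \frac{1}{x-1}\,dx - \frac{1}{2}\int \frac{2x+1}{x^2+x+1}\,dx - \frac{3}{2}\int \frac{1}{x^2+x+1}\,dx \\
&= \log|x-1| - \frac{1}{2}\log|x^2+x+1| - \frac{3}{2}\int \frac{1}{\left(x+\frac{1}{2}\right)^2 + \frac{3}{4}}\,dx \\
&= \log|x-1| - \frac{1}{2}\log(x^2+x+1) - \sqrt{3}\tan^{-1}\left(\frac{2}{\sqrt{3}}\left(x+\frac{1}{2}\right)\right) + C.
\end{aligned} \]
□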
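With this many steps, an independent check is reassuring. The sketch below differentiates the final answer numerically and compares it with $\frac{3}{x^3-1}$ at several points on both sides of $x = 1$; the names, step size, and sample points are illustrative assumptions.

```python
import math

def F(x):
    """Candidate antiderivative of 3 / (x^3 - 1) from the worked solution."""
    return (
        math.log(abs(x - 1.0))
        - 0.5 * math.log(x * x + x + 1.0)
        - math.sqrt(3.0) * math.atan((2.0 / math.sqrt(3.0)) * (x + 0.5))
    )

def f(x):
    return 3.0 / (x ** 3 - 1.0)

def central_difference(g, x, h=1e-5):
    """Symmetric finite-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

xs = [-3.0, -1.0, 0.0, 0.5, 2.0, 4.0]  # sample points away from the singularity at x = 1
max_error = max(abs(central_difference(F, x) - f(x)) for x in xs)
print(max_error)
```

The absolute value inside the first logarithm is what lets the same formula work on both sides of the singularity.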


This last problem combined several nontrivial integration techniques. Also worth noting is that we started with the relatively simple function $\frac{3}{x^3-1}$, and ended up with a really ugly-looking antiderivative. This is quite common: relatively “nice” functions can have really horrible antiderivatives.


While it’s good to slog through an integral like the one in Problem 5.35 at least 
once in your life, in practice it’s much easier to just have your favorite calculator, 
computer, or website do it for you. It’s just like in elementary school, where 
at first you learned how to multiply three-digit numbers by hand, but quickly 


started using a calculator instead. Learning to do the process by hand at first will 
help you learn the deeper meaning behind it, but there’s no reason not to take 
advantage of the technology that is available.

EXERCISES
5.3.1 Compute the following integrals: 
(a) foe-x+2¢0 (b) f (8-2-5) a () f(@sinx - 30s) ax 
(d) { <a ei Ae if on™ dx (f f ise wad 
(s) {sin 00s 040 (h) fase (i laa 
(j) ¥ xe dx (k) i 2x*e* dx (1) f xlog x dx 

1 1 1 
o | ape © Jaan © | a-an* 
© {Bye @ [ile @ [pam 


(s) vere: (t) i eee (u) [v2 via 


Hints: (b) 7 (d) 49 (e) 181 (f) 223 (g) 180 (i) 107 (j) 104 (k) 182 (1) 136, 34 (m) 294 (n) 214, 145 (p) 160 (q) 126 (r) 231 
(s) 9 (t) 25 (u) 296 


5.3.2 Let $f$ be a differentiable function. Compute $\int_a^b f'(x) f(x)\,dx$ in terms of $f(a)$ and $f(b)$.

5.3.3
(a) Describe a method for computing $\int \sin^n x\,dx$ and $\int \cos^n x\,dx$, where $n$ is a positive integer. Hints: 149, 134
(b) Describe a method for computing $\int \sin^m x \cos^n x\,dx$, where $m$ and $n$ are positive integers. Hints: 154, 192

5.3.4 Compute $\int \frac{1}{x(x-1)^2}\,dx$. Hints: 57

5.3.5* Compute $\int \sin^{-1} x\,dx$. Hints: 202


5.4 APPLICATIONS OF THE DEFINITE INTEGRAL 


The definite integral has many different applications. Although we will present a number of different types 
of applications, we will not do very many examples of each application. Once you understand the concept, the 
“examples” are primarily just practice computing antiderivatives, at which you should by now be somewhat 
proficient. 


5.4.1 Areas of regions in the plane 


We defined the definite integral in order to compute the area of a very specific region of the plane: the integral $\int_a^b f$ equals the area of the region below the curve $y = f(x)$, above the $x$-axis, and between the lines $x = a$ and $x = b$. But the definite integral is a flexible instrument, and we can use it to compute areas of other regions of the plane. For example:


Problem 5.36: Let $f$ and $g$ be continuous functions on $[a,b]$, with $f(x) \ge g(x) \ge 0$ for all $x \in [a,b]$. What is the area between the graphs of $f$ and $g$ and the lines $x = a$ and $x = b$?


Solution for Problem 5.36: We can see what's going on by looking at a picture such as the one to the right. The area under $f$ is $\int_a^b f$, and the area under $g$ is $\int_a^b g$. So the area between the two curves is their difference, given by
\[ \int_a^b (f - g). \]
□

Often, we will have to do a little bit of work to determine the proper definite integral, as in the following problem:


Problem 5.37: Find the area of the region in the first quadrant bounded by the graph of y = cos x, the graph 


of y = sin x, and the y-axis. 


Solution for Problem 5.37: We can see from the picture that we want the area between $\cos x$ and $\sin x$. However, we still need to determine the limits of integration. It is clear from the picture that the lower limit of integration is $x = 0$. The upper limit is the smallest positive value of $x$ at which $\sin x = \cos x$. This means that $\tan x = 1$, so $x = \frac{\pi}{4}$.

Thus, the area is given by the definite integral
\[ \int_0^{\pi/4} (\cos x - \sin x)\,dx = (\sin x + \cos x)\,\Big|_0^{\pi/4} = \sqrt{2} - 1. \]
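A numerical approximation gives a quick check of both the limits of integration and the final value. In this sketch the `midpoint` helper is an illustrative assumption; the upper limit is computed as $\arctan 1$ rather than typed in as $\pi/4$.

```python
import math

def midpoint(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule (n slices)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

upper = math.atan(1.0)  # smallest positive x with tan x = 1
area = midpoint(lambda x: math.cos(x) - math.sin(x), 0.0, upper)
print(area, math.sqrt(2.0) - 1.0)
```

On $[0, \frac{\pi}{4}]$ the integrand $\cos x - \sin x$ is nonnegative, so the integral really is an area.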


Problem 5.38: Find the area of the region bounded by the curves $y = 1$, $y = -x + 6$, and $y = \sqrt{x}$.

Solution for Problem 5.38: We can start by sketching a picture of the region, as shown to the right. In order to work with the region, it's necessary to determine the intersection points of the graphs. Some easy algebra will verify that the intersection points are as labeled in the picture. So our region ranges from $x = 1$ to $x = 5$. However, the shaded region is not the region between two graphs, as in our previous examples.

We can break up our desired region into two smaller regions. We note that for $x \in [1,4]$, we have the region between $y = \sqrt{x}$ and $y = 1$. Similarly, for $x \in [4,5]$, we have the region between $y = -x + 6$ and $y = 1$. The sum of these two subregions will give us our entire region.

Thus, the area that we want is
\[ \int_1^4 (\sqrt{x} - 1)\,dx + \int_4^5 ((-x+6) - 1)\,dx. \]

The first integral gives us
\[ \left(\frac{2}{3}x^{3/2} - x\right)\Big|_1^4 = \left(\frac{16}{3} - 4\right) - \left(\frac{2}{3} - 1\right) = \frac{5}{3}. \]

We could also evaluate the other integral using calculus, but we could instead just observe that it is an isosceles right triangle with legs 1, and thus has area $\frac{1}{2}$. Therefore, the total area is $\frac{5}{3} + \frac{1}{2} = \frac{13}{6}$.


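A slightly more elegant solution is to reverse the roles of $x$ and $y$. That is, we think of the curves as graphs of functions of $x$ in terms of $y$, as labeled in the picture to the right. We now see that the region that we want is just the region between the graphs of $x = -y + 6$ and $x = y^2$ in the interval from $y = 1$ to $y = 2$. Thus, we have the definite integral
\[ \int_1^2 \left((-y+6) - y^2\right)dy = \left(-\frac{y^2}{2} + 6y - \frac{y^3}{3}\right)\Big|_1^2 = \left(-2 + 12 - \frac{8}{3}\right) - \left(-\frac{1}{2} + 6 - \frac{1}{3}\right) = \frac{22}{3} - \frac{31}{6} = \frac{13}{6}, \]
as before. □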
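The two ways of slicing the region can be compared numerically. This sketch (with an illustrative `midpoint` helper) computes the area once as two integrals in $x$ and once as a single integral in $y$.

```python
import math

def midpoint(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule (n slices)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Vertical slices: two pieces, split where the upper boundary changes at x = 4.
area_in_x = (
    midpoint(lambda x: math.sqrt(x) - 1.0, 1.0, 4.0)
    + midpoint(lambda x: (-x + 6.0) - 1.0, 4.0, 5.0)
)

# Horizontal slices: one piece, between x = y^2 and x = -y + 6 for y in [1, 2].
area_in_y = midpoint(lambda y: (-y + 6.0) - y * y, 1.0, 2.0)

print(area_in_x, area_in_y, 13.0 / 6.0)
```

The single integral in $y$ avoids the split at $x = 4$, which is the practical advantage of reversing the roles of the variables here.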


This is an important technique to keep in mind. It is sometimes easier to 
think of x as a function of y, rather than the usual y as a function of x. Of course, 
we can only do this if the function is invertible, or if we can restrict the domain suitably to where the function is 
1-to-1. 


Concept: When we are working with invertible functions, it may be easier to think of them as functions of $y$ instead of functions of $x$.


Here’s a harder area-computation example where we use a definite integral: 


Problem 5.39: Determine (with proof) the area of an ellipse with semimajor axis $a$ and semiminor axis $b$. (Semimajor axis means half the distance in the longer direction, and semiminor axis means half the distance in the shorter direction, as in the picture to the right.)




Solution for Problem 5.39: We can make this calculation simplest if we center the ellipse at the origin of the plane. Then, the ellipse passes through the points $(\pm a, 0)$ and $(0, \pm b)$, and is given implicitly by the equation
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \]

However, we can't directly integrate an implicit equation, so we solve for $y$ in terms of $x$:
\[ y = \pm b\sqrt{1 - \frac{x^2}{a^2}}. \]

This is not a function (since it produces two values of $y$ for each value of $x \in (-a,a)$), but we get the top half of the ellipse if we choose the “+” sign (and the bottom half if we choose the “−” sign). Thus, the area of the top half of the ellipse is the area under the curve
\[ y = b\sqrt{1 - \frac{x^2}{a^2}} \]
on the interval $[-a,a]$, and thus the area of the whole ellipse is twice this area. Therefore, the area is equal to
\[ 2\int_{-a}^{a} b\sqrt{1 - \frac{x^2}{a^2}}\,dx. \]
a a 


All we have left to do is to compute this integral. The form of this function strongly suggests a trig substitution. Indeed, we can use the substitution $x = a\sin\theta$, so that $dx = a\cos\theta\,d\theta$. But we also have to change the limits of integration! When $x = -a$, we have $\theta = -\pi/2$, and when $x = a$, we have $\theta = \pi/2$, so the interval $[-a,a]$ in terms of $x$ becomes the interval $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ in terms of $\theta$.


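Therefore, the area of the ellipse is equal to
\[ 2\int_{-\pi/2}^{\pi/2} b\sqrt{1 - \frac{a^2\sin^2\theta}{a^2}}\,(a\cos\theta)\,d\theta. \]

The expression under the square root is just $1 - \sin^2\theta = \cos^2\theta$, and for $\theta \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, the cosine function is nonnegative, so the square root term is just $\cos\theta$, and we have
\[ 2ab\int_{-\pi/2}^{\pi/2} \cos^2\theta\,d\theta. \]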


(We have moved the constants $a$ and $b$ to the front of the integral.) At this point, there are a number of different ways that we could finish the problem. One method is to use the trig identity $\cos^2\theta = \frac{1}{2}(1 + \cos 2\theta)$, making our integral
\[ ab\int_{-\pi/2}^{\pi/2} (1 + \cos 2\theta)\,d\theta. \]

(Note the $\frac{1}{2}$ cancels with the 2 that used to be out front.) This can be broken up into a sum of definite integrals:
\[ ab\int_{-\pi/2}^{\pi/2} 1\,d\theta + ab\int_{-\pi/2}^{\pi/2} \cos 2\theta\,d\theta. \]

The first definite integral is just $\pi$ (since the length of the interval $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ is $\pi$), so the first term is $\pi ab$. We could evaluate the second integral using the Fundamental Theorem:
\[ \int_{-\pi/2}^{\pi/2} \cos 2\theta\,d\theta = \frac{1}{2}\sin 2\theta\,\Big|_{-\pi/2}^{\pi/2} = \frac{1}{2}\left(\sin(\pi) - \sin(-\pi)\right) = \frac{1}{2}(0 - 0) = 0. \]

But we didn't need to compute it explicitly to see that it would be 0. Instead, we notice that the interval $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ is an entire period of the function $\cos 2\theta$ (which is periodic with period $\pi$). Looking at the graph of $y = \cos 2x$ for $x \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, we see that the area under the graph of the part above the $x$-axis will exactly cancel with the area under the graph of the part below the $x$-axis, leaving a net area of 0.

Thus, the area of the ellipse is $\pi ab$. Note that if our ellipse satisfies $a = b = r$, then the ellipse is in fact a circle of radius $r$, and we get our usual area of $\pi r^2$. □
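The formula $\pi ab$ can be checked numerically for a particular ellipse. The sketch below (semi-axes 3 and 2 are arbitrary illustrative values, and `midpoint` is an assumed helper) approximates the area both from the original integral in $x$ and from the integral in $\theta$ obtained after the substitution $x = a\sin\theta$.

```python
import math

def midpoint(f, lo, hi, n=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule (n slices)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

a, b = 3.0, 2.0  # illustrative semi-axes

# Twice the area under the top half of the ellipse.
area_in_x = 2.0 * midpoint(lambda x: b * math.sqrt(1.0 - (x / a) ** 2), -a, a)

# The same area after substituting x = a sin(theta).
area_in_theta = 2.0 * a * b * midpoint(lambda t: math.cos(t) ** 2, -math.pi / 2, math.pi / 2)

print(area_in_x, area_in_theta, math.pi * a * b)
```

The integral in $x$ converges more slowly because the integrand has vertical tangents at $x = \pm a$; the integral in $\theta$ has a smooth integrand, which is one practical benefit of the substitution.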


5.4.2 Volumes 


Although we defined the definite integral to compute areas, we can go up one dimension and use them to 
compute certain volumes as well. Recall that, for computing areas, we constructed a Darboux (or Riemann) sum 
by breaking up our region into a sum of rectangles, where we used a partition of an interval to determine widths 
and a function to determine heights. For volumes, we'll form a Riemann sum by breaking up a three-dimensional 
solid into a sum of slices, where we'll again use a partition of an interval to determine the widths of the slices, but 
where a function will determine the cross-sectional area of each slice. 


Let’s begin with an example for which you probably already know the answer. 


Problem 5.40: Determine the volume of a pyramid of height 5 with a square base of side length 4. 


Solution for Problem 5.40: We'll try a 3-dimensional version of the same strategy that we used to find area under a curve. Specifically, we'll approximate the volume of the pyramid by breaking it up into a sum of rectangular 3-dimensional solids (called cuboids), as shown to the right. We partition the height interval $[0,5]$ into
\[ 0 = x_0 < x_1 < \cdots < x_n = 5. \]

At distance $x_i$ from the vertex, we will construct a cuboid whose base is a square of area $A(x_i)$ (where $A$ is a function that we haven't yet determined) and whose height is $x_{i+1} - x_i$, as shown in the picture to the right. This cuboid thus has volume $A(x_i)(x_{i+1} - x_i)$. If we sum the volumes of the cuboids over our entire partition, we have
\[ \text{Volume} \approx \sum_{i=0}^{n-1} A(x_i)(x_{i+1} - x_i). \]
This is a lower Darboux sum of the function $A(x)$ on the interval $[0,5]$, so if we take the limit as the sizes of the pieces of the partition become arbitrarily small, we get
\[ \text{Volume} = \int_0^5 A(x)\,dx. \]


What remains is to determine the function $A(x)$ that gives the area of the cross-section that is $x$ units from the vertex. We know that the side length of the cross-section varies linearly, from 0 at $x = 0$ (this is the vertex of the pyramid) to 4 at $x = 5$ (this is the base of the pyramid). Thus, the side length of the cross-section at height $x$ is $\frac{4}{5}x$. Therefore, the area is $A(x) = \left(\frac{4}{5}x\right)^2 = \frac{16}{25}x^2$. Now that we know $A(x)$, we can evaluate the definite integral and compute the volume:
\[ \text{Volume} = \int_0^5 A(x)\,dx = \int_0^5 \frac{16}{25}x^2\,dx = \frac{16}{75}x^3\,\Big|_0^5 = \frac{80}{3}. \]
□

The technique of Problem 5.40 can easily be generalized to an arbitrary pyramid.


Problem 5.41: Compute the volume of a pyramid (or cone) with a base of area b and height h. 


Solution for Problem 5.41: Note that we're not assuming anything about the shape of the base. Our only basic assumption is that the cross-sectional area is proportional to the square of the distance from the vertex. In particular, let $A(x)$ denote the area of the cross-sectional slice that is $x$ units from the vertex of the pyramid; we are given that $A(0) = 0$ and $A(h) = b$. We note that each cross section is similar to the base, which means that the cross-sectional area that is $x$ units from the vertex satisfies the proportionality equation
\[ \frac{A(x)}{b} = \left(\frac{x}{h}\right)^2, \]
because the square of the ratio of lengths gives us the ratio of corresponding areas. Thus
\[ A(x) = b\left(\frac{x}{h}\right)^2. \]


We will use these cross-sectional areas to estimate the volume of the pyramid. Just as we did in Problem 5.40, we'll start with a partition of the interval $[0, h]$:
\[ 0 = x_0 < x_1 < \cdots < x_n = h. \]

We break up the pyramid into prisms, where each prism has a height given by the size of the corresponding piece of the partition, so the height of the $i$th prism (for $0 \le i < n$) is $x_{i+1} - x_i$. The area of the base of this prism is given by $A(x_i)$, the cross-sectional area of the pyramid at height $x_i$. Thus, the volume of the $i$th prism is
\[ \text{Volume of $i$th prism} = (\text{Area of base}) \cdot (\text{Height}) = A(x_i)(x_{i+1} - x_i) = b\left(\frac{x_i}{h}\right)^2 (x_{i+1} - x_i). \]


Summing these volumes over our partition gives
\[ \text{Sum of volumes} = \sum_{i=0}^{n-1} A(x_i)(x_{i+1} - x_i) = \sum_{i=0}^{n-1} b\left(\frac{x_i}{h}\right)^2 (x_{i+1} - x_i). \]

But this sum is exactly a lower Darboux sum for the definite integral of the function $A(x)$ on the interval $[0,h]$. Therefore, as the pieces of the partition get smaller and smaller, the volume approaches the definite integral
\[ \int_0^h A(x)\,dx = \int_0^h b\left(\frac{x}{h}\right)^2 dx. \]

We can easily evaluate this integral to get our volume:
\[ \int_0^h b\left(\frac{x}{h}\right)^2 dx = \frac{b}{h^2}\cdot\frac{x^3}{3}\,\Big|_0^h = \frac{b}{h^2}\cdot\frac{h^3}{3} = \frac{1}{3}bh. \]
□
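To see the lower Darboux sums approach $\frac{1}{3}bh$, one can compute them for finer and finer partitions. The sketch below uses uniform partitions and the numbers from Problem 5.40 (base area 16, height 5); the function name and the partition sizes are illustrative assumptions.

```python
def lower_darboux_volume(base_area, height, n):
    """Sum A(x_i) * (x_{i+1} - x_i) over a uniform partition, with A(x) = b (x/h)^2."""
    dx = height / n
    total = 0.0
    for i in range(n):
        x_i = i * dx  # left endpoint, where the increasing function A is smallest
        total += base_area * (x_i / height) ** 2 * dx
    return total

exact = 16.0 * 5.0 / 3.0  # (1/3) b h for the pyramid of Problem 5.40
sums = [lower_darboux_volume(16.0, 5.0, n) for n in (10, 100, 1000, 10_000)]
print(sums, exact)
```

Each sum underestimates the volume, since the stacked prisms sit inside the pyramid, and the estimates increase toward $\frac{80}{3}$ as the partition is refined.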


We can use the method of Problem 5.41 to compute volumes of more general regions. The only essential 
property that we used was the existence of a function A(x) for the cross-sectional area at x. Thus, we can 
generalize: 




Solution for Problem 5.42: We can divide the interval $[a, b]$ into a partition
\[ a = x_0 < x_1 < \cdots < x_n = b. \]

We'll use this partition to break up our solid into smaller prisms, where the $i$th prism has its bases in the planes $x = x_i$ and $x = x_{i+1}$. If we choose a point $w_i \in [x_i, x_{i+1}]$, then we can approximate the volume of the piece of the solid lying within the interval $[x_i, x_{i+1}]$ as the volume of the prism with height $x_{i+1} - x_i$ and cross-sectional area $A(w_i)$. Thus, the volume of the solid is approximated by
\[ \sum_{i=0}^{n-1} (x_{i+1} - x_i)A(w_i). \]

But this is a Riemann sum! As we let the sizes of the pieces of the partition get smaller and smaller, this Riemann sum approaches a definite integral. Thus, the volume of our region is given by the definite integral
\[ \int_a^b A(x)\,dx. \]


Important: The volume of a solid between the planes $x = a$ and $x = b$, with cross-sectional area given by the function $A(x)$, is
\[ \int_a^b A(x)\,dx. \]
This should be thought of as “summing the areas” over the interval $[a,b]$.


This method of computing a volume is sometimes called slicing. We are essentially summing up the areas of 
cross-sectional slices. This technique is very flexible and can be used in a variety of different volume computations. 
For example: 


Problem 5.43: Find the volume of a sphere of radius r. 


Solution for Problem 5.43: We can think of the sphere as sitting with its center at the origin of three-dimensional space. As $x$ varies from $-r$ to $r$, the cross-section of the sphere is a circle with radius $\sqrt{r^2 - x^2}$, and thus has area $\pi(r^2 - x^2)$. Thus, the volume of the sphere is given by the definite integral
\[ \int_{-r}^{r} \pi(r^2 - x^2)\,dx = \pi\left(r^2 x - \frac{x^3}{3}\right)\Big|_{-r}^{r} = \pi\left(\left(r^3 - \frac{r^3}{3}\right) - \left(-r^3 + \frac{r^3}{3}\right)\right) = \frac{4}{3}\pi r^3. \]
□
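The slicing principle translates directly into a small numerical routine: pass in a cross-sectional area function and an interval, and sum thin slices. The sketch below is one possible version (the name `volume_by_slicing`, the midpoint sampling, and the radius 2 are illustrative assumptions), applied to the sphere.

```python
import math

def volume_by_slicing(area, a, b, n=100_000):
    """Approximate the integral of the cross-sectional area A(x) over [a, b].

    Each slice is treated as a thin prism whose base area is A at the slice's midpoint.
    """
    h = (b - a) / n
    return h * sum(area(a + (i + 0.5) * h) for i in range(n))

r = 2.0  # illustrative radius

# Cross-section of the sphere at position x is a disk of radius sqrt(r^2 - x^2).
sphere = volume_by_slicing(lambda x: math.pi * (r * r - x * x), -r, r)
print(sphere, 4.0 / 3.0 * math.pi * r ** 3)
```

Only the area function changes from one solid to the next, which mirrors the point that the various volume formulas are all instances of the same idea.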
There are a couple of common scenarios in which we use the graph of a 
function to get a three-dimensional solid. 


Problem 5.44: Suppose that we start with the graph of $y = f(x)$ between $[a,b]$. We rotate this graph, in three dimensions, around the $x$-axis to get the surface of a three-dimensional solid. Find the volume of this solid of revolution.

Solution for Problem 5.44: The cross-section of the solid at any given value of $x$ is a circle of radius $f(x)$. Thus, the cross-section has area $\pi(f(x))^2$. Therefore, using our general principle of “summing the areas” to get volume, we conclude that the volume is given by the definite integral
\[ \pi\int_a^b (f(x))^2\,dx. \]


Important: The volume of the solid of revolution given by revolving the graph of $y = f(x)$, on the interval $[a,b]$, around the $x$-axis is
\[ \pi\int_a^b (f(x))^2\,dx. \]


Most calculus courses teach the formula for the volume of a solid of revolution 
as a formula to be memorized. However, if you understand this formula as just 
a special case of the general slicing method, then you won’t need to memorize 
it—it should be obvious to you! 


Solution for Problem 5.45: The solid is a bit difficult to visualize; an example is shown to the left below. The $y$-axis is the axis of rotation, and passes through the hollow center of the solid. A “cross-section” of this object, for a specific value of $x$, is the surface of a cylinder. Two such cylinders are in the picture to the right below.

At each value of $x$, the cylinder has circumference $2\pi x$ and height $f(x)$, so it has surface area $2\pi x f(x)$. Hence, the volume of the solid (as $x$ ranges from $a$ to $b$) equals the definite integral of the cross-sectional surface areas as $x$ ranges from $a$ to $b$, and is thus
\[ 2\pi\int_a^b x f(x)\,dx. \]
a 


This method of computing volume is sometimes called the cylindrical shell method.

Important: The volume of the solid of revolution given by revolving the area under the graph of $y = f(x)$, on the interval $[a,b]$, around the $y$-axis is
\[ 2\pi\int_a^b x f(x)\,dx. \]
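Both revolution formulas can be tried on a concrete curve. The sketch below uses the profile $f(x) = x^2$ on $[0,1]$, which is an illustrative choice and not an example from the text, and cross-checks the shell result by slicing the same solid horizontally into washers.

```python
import math

def midpoint(g, a, b, n=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule (n slices)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x):
    return x * x  # illustrative profile curve on [0, 1]

a, b = 0.0, 1.0

# Revolve the graph about the x-axis: disks of radius f(x).
disk_volume = math.pi * midpoint(lambda x: f(x) ** 2, a, b)

# Revolve the region under the graph about the y-axis: shells of radius x and height f(x).
shell_volume = 2.0 * math.pi * midpoint(lambda x: x * f(x), a, b)

# Cross-check of the shell result by horizontal washers: outer radius 1, inner radius sqrt(y).
washer_volume = math.pi * midpoint(lambda y: 1.0 - y, 0.0, 1.0)

print(disk_volume, shell_volume, washer_volume)
```

For this curve the exact values are $\pi/5$ about the $x$-axis and $\pi/2$ about the $y$-axis, and the shell and washer computations describe the same solid, so they should agree.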


Solution for Problem 5.46: We can imagine our sphere with the hole cut along the $y$-axis, as shown in the picture to the right. The top edge of the hole produces a right triangle with hypotenuse $r$ (the radius of the sphere) and leg $a$ (the radius of the hole), so the distance from the center of the sphere to the top edge of the hole is $\sqrt{r^2 - a^2}$. Thus, our integral will sum the areas of the cross-sectional slices of the solid, perpendicular to the direction of the hole, from $-\sqrt{r^2 - a^2}$ to $\sqrt{r^2 - a^2}$ (where this quantity measures the distance of the cross-section from the center of the sphere). We can make the calculation slightly simpler by noticing that the top half of the object has the same volume as the bottom half of the object, so the total volume is just twice the volume from $y = 0$ to $y = \sqrt{r^2 - a^2}$.


To find the cross-sectional area of the slice at height $y$, we observe that this cross-section is an annulus: a circle with a smaller circle removed. The smaller circle has radius $a$. The radius of the larger circle is the leg of a right triangle with hypotenuse $r$ and other leg $y$, so its radius is $\sqrt{r^2 - y^2}$. Thus, the area of the annular cross-section at height $y$ is the difference in the areas of the two circles, which is:
\[ \pi(r^2 - y^2) - \pi a^2 = \pi(r^2 - a^2 - y^2). \]


The total volume is the sum of the areas of the annular cross-sections, given by the definite integral
\[ 2\int_0^{\sqrt{r^2-a^2}} \pi(r^2 - a^2 - y^2)\,dy. \]
(Don't forget the factor of 2, since the given integral is only for the top half of the object.) This evaluates to
\[ 2\pi\left((r^2 - a^2)y - \frac{y^3}{3}\right)\Big|_0^{\sqrt{r^2-a^2}} = 2\pi\left((r^2 - a^2)^{3/2} - \frac{1}{3}(r^2 - a^2)^{3/2}\right) = \frac{4}{3}\pi(r^2 - a^2)^{3/2}. \]

As a check, note that when $a = 0$, we get the entire sphere's volume of $\frac{4}{3}\pi r^3$, and when $a = r$, we get 0, as expected. □
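The closed form for the drilled sphere can be compared against a direct numerical sum of the annular slices. In this sketch the radii $r = 5$ and $a = 3$ are illustrative values, and `midpoint` is an assumed helper.

```python
import math

def midpoint(g, lo, hi, n=100_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule (n slices)."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def drilled_sphere_formula(r, a):
    """Closed form (4/3) * pi * (r^2 - a^2)^(3/2) from the worked solution."""
    return 4.0 / 3.0 * math.pi * (r * r - a * a) ** 1.5

r, a = 5.0, 3.0  # illustrative sphere radius and hole radius
top = math.sqrt(r * r - a * a)  # height of the top edge of the hole

# Twice the top half: annulus of outer radius sqrt(r^2 - y^2) and inner radius a.
numeric = 2.0 * midpoint(lambda y: math.pi * (r * r - a * a - y * y), 0.0, top)
print(numeric, drilled_sphere_formula(r, a))
```

Notice that the answer depends on $r$ and $a$ only through $r^2 - a^2$, that is, only through the half-height of the hole.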


All of these volume methods are essentially the same. They all depend on the idea that to compute volume, we sum the areas of cross-sections along an interval. So you don't need to memorize any of the “special” volume formulas, since they all derive from the same basic principle.


5.4.3 Length of a curve 


Let’s now look at the problem of computing the length of the graph of y = f(x) on the interval [a,b]. If the curve 
is sufficiently simple, then this is easy. For example:

Problem 5.47: Compute the length of the graph of $y = 2x - 3$ on the interval $[1,4]$.


Solution for Problem 5.47: This curve is just the line segment from $(1,-1)$ to $(4,5)$, which has length
\[ \sqrt{3^2 + 6^2} = \sqrt{45} = 3\sqrt{5}. \]
□


Problem 5.48: Compute the length of the graph of $y = \sqrt{4 - x^2}$ on the interval $[1,2]$.


Solution for Problem 5.48: The domain of the function is $[-2, 2]$, and the entire graph (for $x \in [-2,2]$) is a semicircle of radius 2. The circumference of the semicircle is $2\pi$. This might lead to a quick “solution”:

Bogus Solution: The interval in question, $[1,2]$, is $\frac{1}{4}$ of the entire domain, so the length of the curve on $[1,2]$ is $\frac{1}{4}$ of the total length on $[-2,2]$, or $\frac{1}{4}(2\pi) = \frac{\pi}{2}$.

This is not correct! Just because $[1,2]$ is one-quarter of the domain does not mean we get $\frac{1}{4}$ of the total length of the curve. If we look more carefully at the diagram at right, we will see the correct procedure. The section of the circle from $x = 1$ to $x = 2$ is the arc subtended by the angle $\theta$ as shown in the diagram. This angle satisfies $\cos\theta = \frac{1}{2}$ (using the triangle shown), so $\theta = \frac{\pi}{3}$. This implies that the arc is $\frac{1}{3}$ of the complete semicircle. Thus, the correct answer is that the length for $x \in [1,2]$ is $\frac{1}{3}$ of the circumference of the semicircle, or $\frac{2\pi}{3}$. □


In general, it is hard to compute the length of an arbitrary curve. But we know that computing the lengths of 
line segments is really easy, as in Problem 5.47. Thus, our strategy will be to break up the curve into small pieces, 
approximate each small piece of the curve with a line segment, measure the lengths of the line segments, and sum 
these segment lengths. 


Problem 5.49: Let $f$ be a differentiable function on $[a,b]$. We wish to determine the length of the curve $y = f(x)$ from $x = a$ to $x = b$.

(a) Partition the interval $[a,b]$ as
\[ a = x_0 < x_1 < \cdots < x_n = b. \]
What quantity approximates the length from $(x_i, f(x_i))$ to $(x_{i+1}, f(x_{i+1}))$? Sum these approximate lengths to get an expression that approximates the length of the curve.

(b) We'd like the sum from part (a) to look like a Riemann sum of the form
\[ \sum_{i=0}^{n-1} g(w_i)(x_{i+1} - x_i) \]
for some function $g$ and some $w_i \in [x_i, x_{i+1}]$. How can we manipulate our sum to be in this form?

(c) Write a definite integral for the length of $y = f(x)$ from $x = a$ to $x = b$.


Solution for Problem 5.49: 


(a) As usual, we'll start by partitioning the interval $[a, b]$ into smaller pieces:
\[ a = x_0 < x_1 < \cdots < x_n = b. \]
For each interval $[x_i, x_{i+1}]$ with $0 \le i < n$, a close approximation to the curve is the straight line between
$(x_i, f(x_i))$ and $(x_{i+1}, f(x_{i+1}))$. Its length is
\[ \sqrt{(f(x_{i+1}) - f(x_i))^2 + (x_{i+1} - x_i)^2}. \]
Thus, our total arc length can be approximated as a sum:
\[ \sum_{i=0}^{n-1} \sqrt{(f(x_{i+1}) - f(x_i))^2 + (x_{i+1} - x_i)^2}. \]
(b) Unfortunately, our sum above doesn’t exactly look like a Riemann sum, so it’s not immediately clear that this
quantity becomes a definite integral as we let the partition pieces get smaller and smaller.
Here’s the trick: what do we know about the difference of $f(x_{i+1})$ and $f(x_i)$? How can we approximate it
or otherwise express it? We can use the Mean Value Theorem! Specifically, there is a point $w_i \in [x_i, x_{i+1}]$ such
that
\[ f'(w_i) = \frac{f(x_{i+1}) - f(x_i)}{x_{i+1} - x_i}, \]
which gives
\[ (x_{i+1} - x_i) f'(w_i) = f(x_{i+1}) - f(x_i). \]
We can substitute this into our sum:
\[ \sum_{i=0}^{n-1} \sqrt{(f(x_{i+1}) - f(x_i))^2 + (x_{i+1} - x_i)^2} = \sum_{i=0}^{n-1} \sqrt{(f'(w_i)(x_{i+1} - x_i))^2 + (x_{i+1} - x_i)^2}. \]
We factor the $(x_{i+1} - x_i)$ term out of the square root:
\[ \sum_{i=0}^{n-1} \left(\sqrt{(f'(w_i))^2 + 1}\right)(x_{i+1} - x_i). \]
Now it looks exactly like a Riemann sum!


(c) The expression above is a Riemann sum of the function $\sqrt{(f'(x))^2 + 1}$ along the interval $[a,b]$. Thus, as the
partition widths approach 0, the sum becomes a definite integral. Therefore, the length of the curve $y = f(x)$
along the interval $[a,b]$ is
\[ \int_a^b \sqrt{(f'(x))^2 + 1}\,dx. \]

Important: The length of the graph of $y = f(x)$ from the point $(a, f(a))$ to the point $(b, f(b))$
is given by
\[ \int_a^b \sqrt{(f'(x))^2 + 1}\,dx. \]


Let’s go back to our circle example from Problem 5.48 and verify that this gives what we expect:

Solution for Problem 5.50: Let $f(x) = \sqrt{4 - x^2}$. Then $f'(x) = \frac{-x}{\sqrt{4 - x^2}}$, and the length is given by the integral
\[ \int_1^2 \sqrt{\left(\frac{-x}{\sqrt{4 - x^2}}\right)^2 + 1}\,dx. \]
This simplifies to
\[ \int_1^2 \sqrt{\frac{x^2}{4 - x^2} + 1}\,dx. \]
Writing the integrand with a common denominator, we get:
\[ \int_1^2 \sqrt{\frac{4}{4 - x^2}}\,dx = \int_1^2 \frac{1}{\sqrt{1 - \left(\frac{x}{2}\right)^2}}\,dx. \]
The antiderivative of $\frac{1}{\sqrt{1 - u^2}}$ is $\sin^{-1} u$, so our definite integral is
\[ 2\sin^{-1}\left(\frac{x}{2}\right)\Big|_1^2 = 2\left(\frac{\pi}{2} - \frac{\pi}{6}\right) = \frac{2\pi}{3}. \]
□

Unfortunately, for most functions $f$, the integral $\int_a^b \sqrt{(f'(x))^2 + 1}\,dx$ is rather difficult to compute. This indicates
that arc length is often a difficult quantity to deal with (although it can be analyzed using the integral approximation
techniques that we will study in Section 5.5).
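The strategy of this section (replace the curve by many short line segments and add up their lengths) can be checked numerically. The following is a minimal Python sketch, not part of the original text; the helper names `polyline_length` and `semicircle` are hypothetical, and the number of segments is an arbitrary choice. It approximates the length of $y = \sqrt{4 - x^2}$ on $[1,2]$ and compares it with $\frac{2\pi}{3}$.

```python
import math

def polyline_length(f, a, b, n):
    """Sum the lengths of n chords joining consecutive points on y = f(x)."""
    total = 0.0
    x_prev, y_prev = a, f(a)
    for i in range(1, n + 1):
        x = a + (b - a) * i / n
        y = f(x)
        # Length of the segment from (x_prev, y_prev) to (x, y)
        total += math.hypot(x - x_prev, y - y_prev)
        x_prev, y_prev = x, y
    return total

def semicircle(x):
    # max(...) guards against a tiny negative value from rounding near x = 2
    return math.sqrt(max(0.0, 4.0 - x * x))

approx = polyline_length(semicircle, 1.0, 2.0, 10_000)
exact = 2 * math.pi / 3
print(approx, exact)
```

Because every chord is no longer than the arc it spans, the polyline estimate should approach the true length from below as the number of segments grows.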


5.4.4 Average value of a function 


A key way to think of a definite integral is as a “continuous sum.” (That’s why the integral sign looks like an 
elongated letter S.) We have already used this interpretation a number of times—for instance, to compute volume 
as a sum of cross-sectional areas, or to compute arc length as a sum of lengths of line segments. This general 
viewpoint of “a definite integral as a sum” gives us some other uses for the definite integral. For example, we 
know that when we have discrete data—that is, we have n data points for some positive integer n—then the 
average value is the sum of the values of the data points divided by the number of data points. We can use the 
definite integral as a sum to extend this to an average of continuous data: 


Solution for Problem 5.51: For a continuous function $f$, we think of the definite integral as performing a “sum” of
the values of the function, and we use the length of the interval as the analog of the “number of data points” that
we must divide by to get the average. So our candidate for the average value of $f$ on $[a,b]$ is:
\[ \frac{1}{b-a}\int_a^b f(x)\,dx. \]

We can see why this makes sense by thinking in terms of a Riemann sum. As usual, suppose that we partition the
interval $[a,b]$ into $n$ pieces:
\[ a = x_0 < x_1 < \cdots < x_n = b, \]
and for simplicity, suppose that the pieces are equal sized, so that $x_{i+1} - x_i = \frac{b-a}{n}$ for all $0 \le i < n$. We can
approximate the average value of $f$ by choosing some $w_i \in [x_i, x_{i+1}]$ and then averaging (in the usual discrete
sense) the values $f(w_i)$. This gives us
\[ \text{Average of } f \approx \frac{1}{n}\sum_{i=0}^{n-1} f(w_i). \]

Let’s make this look more like a Riemann sum by multiplying inside the sum by $(x_{i+1} - x_i)$, and dividing outside
the sum by the same quantity, which we assumed was equal to $\frac{b-a}{n}$ for all $i$:
\[ \text{Average of } f \approx \frac{1}{n}\cdot\frac{n}{b-a}\sum_{i=0}^{n-1} (x_{i+1} - x_i) f(w_i). \]

As the widths of the partition get smaller, the Riemann sum approaches a definite integral, and the $n$’s outside the
sum cancel; therefore we are justified in writing
\[ \text{Average of } f = \frac{1}{b-a}\int_a^b f(x)\,dx. \]

Geometrically, you can think of the average of $f$ as the “average” height of the region under $f$. That is, taking
the area of a rectangle with base $[a,b]$ and height the average value of $f$ gives us:
\[ \text{Area} = (b - a)\cdot(\text{Average value of } f) = (b - a)\cdot\frac{1}{b-a}\int_a^b f(x)\,dx = \int_a^b f(x)\,dx, \]
which is the area under $f$ on $[a,b]$. □


In summary, we have 


Definition: The average value of a continuous function $f$ along an interval $[a,b]$ (with $a < b$) is
\[ \frac{1}{b-a}\int_a^b f(x)\,dx. \]


Here’s a simple application:

Problem 5.52: The temperature at time $t$ (given in hours from 0 to 24 after midnight) in downtown Aopsville is
given by $T = 10 - 5\sin\left(\frac{\pi t}{12}\right)$ (degrees Celsius). What is the average temperature between noon and midnight?

Solution for Problem 5.52: By definition, the quantity we seek is
\[ \frac{1}{24 - 12}\int_{12}^{24} \left(10 - 5\sin\left(\frac{\pi t}{12}\right)\right) dt. \]
This gives us
\[ \frac{1}{12}\left(10t + \frac{60}{\pi}\cos\left(\frac{\pi t}{12}\right)\right)\Big|_{12}^{24}. \]
This is
\[ 10 + \frac{5}{\pi}(\cos 2\pi - \cos\pi) = 10 + \frac{10}{\pi} \approx 13.18. \]
This is reasonable: at noon the temperature is 10, at 6:00 p.m. it is 15, and at midnight it is 10 again. But the temperature
stays closer to 15 for a longer period of time. So an average of 13.18 over the time period makes sense. □
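As a sanity check on the average-value formula, here is a small Python sketch (an illustration only; the function names `average_value` and `temperature` and the choice of a midpoint Riemann sum with 100,000 pieces are assumptions, not from the text). It averages the temperature function over noon to midnight exactly as in the Riemann-sum argument: sample, sum, and divide by the interval length.

```python
import math

def average_value(f, a, b, n=100_000):
    """Approximate (1/(b-a)) * integral of f over [a, b] with a midpoint Riemann sum."""
    width = (b - a) / n
    total = sum(f(a + (i + 0.5) * width) for i in range(n))
    return total * width / (b - a)

def temperature(t):
    # T = 10 - 5 sin(pi t / 12), with t in hours after midnight
    return 10 - 5 * math.sin(math.pi * t / 12)

avg = average_value(temperature, 12, 24)
print(avg, 10 + 10 / math.pi)
```

Calling `average_value(temperature, 6, 18)` should return a value very close to 10, matching the symmetry observation about the interval from 6 a.m. to 6 p.m.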


Note that, in Problem 5.52, if we had asked for the average temperature between 6 a.m. and 6 p.m., we wouldn’t
need to write an integral: the symmetry of the temperature function about $t = 12$ shows clearly that the average
is 10.


Closely related to the idea of average value is another interpretation of the definite integral that goes back to 
the Fundamental Theorem of Calculus. If f is differentiable, then for any a < b we can write 


\[ f(b) = f(a) + \int_a^b f'(t)\,dt. \]

That is, the definite integral of $f'$ sums up the accumulated rate of change of $f$. In particular, when we rearrange
this and divide by $b - a$ (assuming that $a < b$), we get:
\[ \frac{f(b) - f(a)}{b - a} = \frac{1}{b-a}\int_a^b f'(t)\,dt = \text{average value of } f'(x) \text{ on } [a,b]. \]

That is, $\frac{1}{b-a}\int_a^b f'$ is equal to the average rate of change of the function on $[a,b]$, as we expect.
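A short worked instance may make this concrete. The function $f(x) = x^2$ and the interval $[1,3]$ below are chosen purely for illustration and do not come from the text:

```latex
\[
\frac{f(3) - f(1)}{3 - 1} = \frac{9 - 1}{2} = 4,
\qquad
\frac{1}{3 - 1}\int_1^3 f'(t)\,dt = \frac{1}{2}\int_1^3 2t\,dt
  = \frac{1}{2}\Bigl[t^2\Bigr]_1^3 = \frac{1}{2}(9 - 1) = 4.
\]
```

Both computations give 4: the slope of the secant line equals the average value of the derivative over the interval.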


Concept: All of the interpretations of the definite integral that we have discussed are interrelated!
They all follow from the general idea of the definite integral as a
continuous sum.


EXERCISES 
5.4.1 Find the areas of the following regions: 


(a) The bounded region between y = x* and y = x°. 
(b) The bounded region between y = log x and the line segment connecting (e, 1) and (e, 2). 


(c) The region bounded by the y-axis, y = x*, and y = cos x. Express the area in terms of the positive constant a 


such that cosa = a”. 


5.4.2 Find the volumes of the following: 


(a) The region enclosed by the surface resulting when the curve y = x° on [0,2] is rotated about the x-axis. 

(b) The solid consisting of the region under the curve y = x° along [0,2], rotated about the y-axis. 

(c) The region enclosed by the surface resulting when the curve $y = \cos x$ on $[0, \pi/2]$ is rotated about the x-axis.
(d) The region y? < x? - 1 for x € [2,3], rotated about the y-axis. 


(e) The ellipsoid obtained by rotating the ellipse (:) + (4) = 1 about the x-axis. 


5.4.3 Find the lengths of the following curves: 


x 1 
(a) y= 0 4 3,3 between x = 2and x= 4. 
(b) y=3x? —1 between (0,-1) and (4,23). 


5.4.4* Compute the volume of a torus of radius $a$ with cross-sectional radius $b$, with $a > b$. (The torus is the
donut-shaped object that results when a circle of radius $b$ is rotated around a line in the same plane as the circle,
where $a$ is the distance between the line and the center of the circle.) Hints: 194, 137

5.4.5
(a) Let $f$ be a continuous function defined on $[a,b]$. Show that there exists $c \in [a,b]$ such that
\[ \int_a^b f(x)\,dx = f(c)(b - a). \]
Hints: 155
(b) Suppose that $g$ is also a continuous function defined on $[a,b]$ whose range is nonnegative. Show that there
exists $c \in [a,b]$ such that
\[ \int_a^b f(x) g(x)\,dx = f(c)\int_a^b g(x)\,dx. \]
Hints: 78


(c) Find an example of continuous functions f and g defined on [a,b], for which the equation in part (b) is not 
satisfied for any c € [a,b]. Hints: 152 


5.5 APPROXIMATION TECHNIQUES 


Many integrals, such as

1 2.
t
{ e* dx and { eae
0 rt

cannot be explicitly evaluated. For these and other integrals, it is important to have a technique for approximating
them via numerical methods. This basically amounts to manually computing an approximation of the area under
the curve.


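As an example, we will approximate the definite integral
\[ \int_{-1}^{2} \left(-\tfrac15 x^2 + 2\right) dx. \]

Of course, we can easily compute this integral explicitly:
\[ \int_{-1}^{2} \left(-\tfrac15 x^2 + 2\right) dx = \left(-\tfrac{1}{15}x^3 + 2x\right)\Big|_{-1}^{2} = \left(-\tfrac{8}{15} + 4\right) - \left(\tfrac{1}{15} - 2\right) = \tfrac{27}{5} = 5.4. \]

This is the area under the curve $y = -\frac15 x^2 + 2$ between $x = -1$ and $x = 2$.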


There are a number of approximation techniques using rectangles. As you know by now, we defined the 
definite integral as a limit of sums of areas of rectangles. We can thus approximate an integral as a sum of areas 
of rectangles. There are several different ways that we can perform this approximation, based on different ways 
that we choose the rectangles. 


Let’s do a bunch of different approximations with $n = 6$ rectangles. (Of course, the more rectangles that we
use, the greater the accuracy of the estimate should be.) For simplicity, we'll partition $[a,b]$ into subintervals of
equal length, so that each rectangle will have width $\frac{b-a}{n}$. We then get a partition
\[ a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b, \]
where $x_i = a + i\left(\frac{b-a}{n}\right)$. In our example, $[a,b] = [-1,2]$ and $n = 6$, so $x_0 = -1$, $x_1 = -0.5$, $x_2 = 0$, and so on up to
$x_6 = 2$.

We can use the left endpoint of each interval to determine the height:
\begin{align*}
\text{Area} &\approx \frac{b-a}{n}\sum_{i=0}^{n-1} f(x_i) \\
&= (0.5)(f(-1) + f(-0.5) + f(0) + f(0.5) + f(1) + f(1.5)) \\
&= (0.5)(1.8 + 1.95 + 2 + 1.95 + 1.8 + 1.55) \\
&= (0.5)(11.05) \\
&= 5.525.
\end{align*}

The general formula for a left-side rectangular estimate is
\[ \frac{b-a}{n}\sum_{i=0}^{n-1} f(x_i), \]
but don’t memorize this formula: think of it in terms of the rectangles.


Similarly, we can use the right endpoint of each interval to determine the height:
\begin{align*}
\text{Area} &\approx \frac{b-a}{n}\sum_{i=1}^{n} f(x_i) \\
&= (0.5)(f(-0.5) + f(0) + f(0.5) + f(1) + f(1.5) + f(2)) \\
&= (0.5)(1.95 + 2 + 1.95 + 1.8 + 1.55 + 1.2) \\
&= (0.5)(10.45) \\
&= 5.225.
\end{align*}

The general formula for a right-side rectangular estimate is
\[ \frac{b-a}{n}\sum_{i=1}^{n} f(x_i). \]


Notice that the only difference in the formulas for the left-side rectangular estimate and the right-side
rectangular estimate is the limits of the summation: for the left-side estimate, we sum from $i = 0$ to $n - 1$, but for
the right-side estimate, we sum from $i = 1$ to $n$. Also note the similarity of these sums to our average value
computation from Problem 5.51; in particular, if we divide these sums by $b - a$, we get our usual discrete estimate
for the average value $\frac{1}{b-a}\int_a^b f$, as expected.


We also note that the average of these two estimates is (5.525 + 5.225)/2 = 5.375, which is a lot closer to the 
actual value of 5.4 than either of the original estimates. Geometrically, what does this “average” represent? 


If we average the left-endpoint and right-endpoint estimates, we get:
\[ \frac{b-a}{n}\sum_{i=0}^{n-1} \frac{f(x_i) + f(x_{i+1})}{2}, \]
which is the sum of the areas of trapezoids. This formula is usually written by pulling out the $\frac12$ and combining the
other terms:
\[ \frac12\left(\frac{b-a}{n}\sum_{i=0}^{n-1} f(x_i) + \frac{b-a}{n}\sum_{i=1}^{n} f(x_i)\right) = \frac{b-a}{2n}\left(f(x_0) + 2\sum_{i=1}^{n-1} f(x_i) + f(x_n)\right), \]
and it is called the Trapezoid Rule.


Let's see it explicitly with our example:
\begin{align*}
\text{Area} &\approx \frac{b-a}{2n}\left(f(x_0) + 2\sum_{i=1}^{5} f(x_i) + f(x_6)\right) \\
&= (0.25)(f(-1) + 2f(-0.5) + 2f(0) + 2f(0.5) + 2f(1) + 2f(1.5) + f(2)) \\
&= (0.25)(1.8 + 3.9 + 4 + 3.9 + 3.6 + 3.1 + 1.2) \\
&= (0.25)(21.5) \\
&= 5.375.
\end{align*}


This is very close to the actual value of 5.4, and we can see in the picture that the trapezoids very closely fill 
the entire area under the curve. 
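The three estimates computed so far can be reproduced with a few lines of Python. This is only a sketch: the function names `left_sum`, `right_sum`, and `trapezoid` are hypothetical, and `f` is the example integrand $-\frac15 x^2 + 2$ on $[-1,2]$ with $n = 6$.

```python
def f(x):
    return -x**2 / 5 + 2

def left_sum(f, a, b, n):
    """Rectangles whose heights come from the left endpoints x_0, ..., x_{n-1}."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))

def right_sum(f, a, b, n):
    """Rectangles whose heights come from the right endpoints x_1, ..., x_n."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(1, n + 1))

def trapezoid(f, a, b, n):
    """Trapezoid Rule: endpoints weighted once, interior points weighted twice."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return (h / 2) * (f(a) + 2 * interior + f(b))

a, b, n = -1, 2, 6
left = left_sum(f, a, b, n)
right = right_sum(f, a, b, n)
trap = trapezoid(f, a, b, n)
print(left, right, trap)
```

Up to floating-point rounding, the printed values should agree with 5.525, 5.225, and 5.375, and the trapezoid value should equal the average of the other two.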


There are two other important approximation methods. One of these methods is to use rectangles where the
height is determined by the midpoint of each interval. That is,
\[ \frac{b-a}{n}\sum_{i=0}^{n-1} f\left(\frac{x_i + x_{i+1}}{2}\right). \]

Here’s the computation with our example:
\begin{align*}
\text{Area} &\approx \frac{b-a}{n}\sum_{i=0}^{5} f\left(\frac{x_i + x_{i+1}}{2}\right) \\
&= (0.5)(f(-0.75) + f(-0.25) + f(0.25) + f(0.75) + f(1.25) + f(1.75)) \\
&= (0.5)(1.8875 + 1.9875 + 1.9875 + 1.8875 + 1.6875 + 1.3875) \\
&= (0.5)(10.825) \\
&= 5.4125.
\end{align*}

This is also a pretty good estimate.


Finally, we can make an even more accurate estimate by using parabolas, instead of rectangles or trapezoids,
to approximate the area under the curve. The details of this construction are a bit technical, so we will defer
these details to Section 5.B. The surprising result is that this estimate turns out to be a weighted average of the
midpoint estimate and the trapezoid estimate: we weight the midpoint estimate by $\frac23$ and the trapezoid estimate
by $\frac13$. Specifically, in our example, this gives us
\[ \tfrac23(5.4125) + \tfrac13(5.375) = 5.4, \]
which happens to exactly match the value of the integral! In general we won’t get the exact answer, but this
estimate tends to be the most accurate.

In practice, we execute this approximation by “overlapping” the two rules via a partition with twice as many
components. So our new partition uses $m = 2n$ intervals, giving
\[ a = x'_0 < x'_1 < \cdots < x'_m = b. \]

Note that this new partition just splits all of the intervals in our old partition in half: if $i$ is even, then our “new”
$x'_i$ is equal to the “old” $x_{i/2}$, and if $i$ is odd, then our “new” $x'_i$ is the midpoint between the “old” $x_{(i-1)/2}$ and $x_{(i+1)/2}$.


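The trapezoid rule with the old partition gives
\[ \frac{b-a}{2n}\left(f(x_0) + 2\sum_{i=1}^{n-1} f(x_i) + f(x_n)\right) = \frac{b-a}{m}\left(f(x'_0) + 2\sum_{i=1}^{n-1} f(x'_{2i}) + f(x'_m)\right). \]

The midpoint rule with the old partition gives
\[ \frac{b-a}{n}\sum_{i=0}^{n-1} f\left(\frac{x_i + x_{i+1}}{2}\right) = \frac{2(b-a)}{m}\sum_{i=0}^{n-1} f(x'_{2i+1}). \]

Weighting the trapezoid rule by $\frac13$ and the midpoint rule by $\frac23$ and combining gives
\[ \frac{b-a}{3m}\left(f(x'_0) + 2\sum_{i=1}^{n-1} f(x'_{2i}) + f(x'_m)\right) + \frac{4(b-a)}{3m}\sum_{i=0}^{n-1} f(x'_{2i+1}). \]

It is clearer what is going on when we pull $\frac{b-a}{3m}$ outside:
\[ \frac{b-a}{3m}\left(f(x'_0) + 2\sum_{i=1}^{n-1} f(x'_{2i}) + f(x'_m) + 4\sum_{i=0}^{n-1} f(x'_{2i+1})\right). \]

This is ugly, but it’s clearer if we write it without the summation symbols, write it in terms of $m$ instead of $n$, and
list the terms in order:
\[ \frac{b-a}{3m}\left(f(x'_0) + 4f(x'_1) + 2f(x'_2) + 4f(x'_3) + 2f(x'_4) + \cdots + 4f(x'_{m-1}) + f(x'_m)\right). \]

This is called Simpson’s Rule, and it generally has the smallest error of all the approximation techniques.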
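To see that the $1, 4, 2, 4, \ldots, 4, 1$ pattern really is the $\frac23$ midpoint plus $\frac13$ trapezoid combination, here is a hedged Python sketch. The function names `midpoint`, `trapezoid`, and `simpson` are hypothetical, and the integrand is the running example $-\frac15 x^2 + 2$ on $[-1,2]$.

```python
def f(x):
    return -x**2 / 5 + 2

def midpoint(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def trapezoid(f, a, b, n):
    h = (b - a) / n
    return (h / 2) * (f(a) + 2 * sum(f(a + i * h) for i in range(1, n)) + f(b))

def simpson(f, a, b, m):
    """Simpson's Rule on m subintervals (m must be even): weights 1, 4, 2, 4, ..., 4, 1."""
    if m % 2 != 0:
        raise ValueError("m must be even")
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        # Odd-indexed points are the old midpoints (weight 4);
        # even-indexed interior points are the old partition points (weight 2).
        total += (4 if i % 2 == 1 else 2) * f(a + i * h)
    return total * h / 3

a, b, n = -1, 2, 6
weighted = (2 / 3) * midpoint(f, a, b, n) + (1 / 3) * trapezoid(f, a, b, n)
direct = simpson(f, a, b, 2 * n)
print(weighted, direct)
```

Both printed numbers should be 5.4 up to rounding. That the estimate is exact here is expected, since the integrand is itself a parabola.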


Let’s use all of these techniques on another example: 


Problem 5.53: Compute approximations for the integral $\int_0^1 e^{-x^2}\,dx$ (to 4 decimal places) using $n = 4$ via: (a)
left-side rectangles, (b) right-side rectangles, (c) the Trapezoid Rule, (d) midpoint rectangles, and (e) Simpson’s
Rule (with $m = 2n = 8$).

Solution for Problem 5.53: Let $f(x) = e^{-x^2}$. Here is a chart of the computations:

Method                | Computation                                                               | Result
Left-side rectangles  | (1/4)(f(0) + f(0.25) + f(0.5) + f(0.75))                                  | 0.8220
Right-side rectangles | (1/4)(f(0.25) + f(0.5) + f(0.75) + f(1))                                  | 0.6640
Trapezoid Rule        | (1/8)(f(0) + 2f(0.25) + 2f(0.5) + 2f(0.75) + f(1))                        | 0.7430
Midpoint rectangles   | (1/4)(f(0.125) + f(0.375) + f(0.625) + f(0.875))                          | 0.7487
Simpson’s Rule        | (1/24)(f(0) + 4f(0.125) + 2f(0.25) + 4f(0.375) + 2f(0.5)                  | 0.7468
                      |        + 4f(0.625) + 2f(0.75) + 4f(0.875) + f(1))                         |

The Simpson’s Rule estimate above is accurate to 4 decimal places. In fact, if we carried out the calculation to 6
decimal places, the Simpson’s Rule estimate is 0.746826 and the actual value of the integral is 0.746824. □
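The chart can be regenerated with a short Python sketch. This is an illustration under stated assumptions: the variable names are hypothetical, and the reference value is taken from the standard-library function `math.erf`, using the identity $\int_0^1 e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\operatorname{erf}(1)$.

```python
import math

def f(x):
    return math.exp(-x * x)

a, b, n = 0.0, 1.0, 4
h = (b - a) / n
xs = [a + i * h for i in range(n + 1)]              # x_0, ..., x_n
mids = [(xs[i] + xs[i + 1]) / 2 for i in range(n)]  # midpoints of the n pieces

left = h * sum(f(x) for x in xs[:-1])
right = h * sum(f(x) for x in xs[1:])
trap = (left + right) / 2
mid = h * sum(f(x) for x in mids)
simpson = (2 * mid + trap) / 3   # same as Simpson's Rule with m = 2n = 8

exact = math.sqrt(math.pi) / 2 * math.erf(1.0)

for name, value in [("left", left), ("right", right), ("trapezoid", trap),
                    ("midpoint", mid), ("simpson", simpson), ("exact", exact)]:
    print(f"{name:10s} {value:.6f}")
```

Rounded to four places, the first five rows should match the chart, and the last two rows show how close the Simpson estimate is to the true value.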


Problem 5.53 shows the usefulness of the various approximation methods. The
integral $\int_0^1 e^{-x^2}\,dx$ is not directly computable: no integration technique will allow
us to get an exact answer for this integral. So we are forced to use some sort of
approximation method. Many functions that are very useful to mathematics and
engineering are in the same boat: their integrals are well defined but uncomputable
except by numerical approximation.


EXERCISES 


me Ss 
5.5.1 Compute (to 4 decimal places) the different estimates on the integral { = dt, using n = 4 (and m = 8 
1 


for Simpson’s Rule). 


REVIEW PROBLEMS 
5.54 Compute the following integrals: 


(a) Tas (b) [rian © f x2 Vx3 + dx 
i232 0 
x2 n/3 
(d) yaa (e) ., tan 0d@ (f) [2 —dx 
(g) [ #sinxar (h) [ vi=eax (i) (34 dx 


() 


1 
Pa TW fo () ree 
Hints: (c) 259 (d) 109 (e) 275, 179 (f) 77 (g) 100 (h) 81 (j) 204, 297, 36 (k) 233 


5.55 Let $f$ be continuous on $[a,b]$, and let $c$ be a nonzero real number.

(a) Show that $\int_a^b f(x)\,dx = \int_{a+c}^{b+c} f(x - c)\,dx$.

(b) Show that $\int_a^b f(x)\,dx = \frac{1}{c}\int_{ac}^{bc} f\left(\frac{x}{c}\right) dx$.

5.56 Let $f(x)$ be an antiderivative of $e^{-x^2}$. (This function is very important to statistics, but is not expressible in
terms of “nice” functions that we know.) Determine $\int x^2 e^{-x^2}\,dx$ in terms of $f(x)$.

5.57 Find a general formula for $\int e^{ax}\sin(bx)\,dx$, where $a$ and $b$ are real numbers. Hints: 92, 13


1 2 
5.58 Suppose f(x) is an even function (so that f(—x) = f(x)) such that A f(x)dx = 8 and { f(x)dx = 12. Find 
-1 iy 


2 
i f(x) dx. Hints: 8 
1 




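5.59 The horizontal line $y = c$ intersects the curve $y = 2x - 3x^3$ in the first
quadrant as shown at right. Find $c$ so that the areas of the two shaded regions
are equal. (Source: Putnam) Hints: 301

[Figure: the curve $y = 2x - 3x^3$ and the line $y = c$ in the first quadrant, with two shaded regions.]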


5.60 Let u and v be differentiable functions. Determine an expression for 


5.61 Write a definite integral for the length of one period of the graph of the sine function. (Don’t try to evaluate 
the integral: it’s not possible to evaluate it in terms of elementary functions.) 


5.62 Find the volumes of the following: 

(a) The region enclosed by the surface resulting when the curve y = x* — x on [1,2] is rotated about the x-axis. 
(b) The solid consisting of the region between y = cos x and the x-axis along [o, x, rotated about the y-axis. 
(c) A truncated cone with height 6, whose top is a circle of radius 2 and whose bottom is a circle of radius 5. 


5.63 Consider a sphere of radius 2 centered at the origin. Find the volume of the portion of the sphere lying 
between the planes x = 0 and x = 1. Hints: 167 


5.64 Find a formula (in terms of a definite integral) for the surface area of the volume of revolution (about the
x-axis) of the graph $y = f(x)$ along the interval $[a,b]$. (As a test, the surface area of a sphere of radius $r$ is $4\pi r^2$, so
your formula should work with the function $f(x) = \sqrt{r^2 - x^2}$ on $[-r,r]$ to give $4\pi r^2$.) Hints: 31

5.65 Let $f(x)$ be a cubic polynomial and let $a > 0$ be a positive real number. Show that the average value of $f(x)$ on
the interval $[-a,a]$ can be computed by taking the average of $f\left(\frac{a}{\sqrt3}\right)$ and $f\left(-\frac{a}{\sqrt3}\right)$. (Source: Putnam) Hints: 291, 63


CHALLENGE PROBLEMS 


5.66 The following integrals are all from the first two rounds of the 2007 MIT Integration Bee (see description on 
page 147). Try them for yourself! 


(a) { (2logx + (log x)*)dx Hints: 102, 286 (b) ae dx Hints: 290 

() - sin(V/x) dx Hints: 197 (d) f a - jax Hints: 198, 94 

5.67 Compute the following definite integrals: 

(a) cf as dx Hints: 256 (b) f . ee dx Hints: 108, 238 


7 x9 + 3x2 4x 


(Source: (a) Putnam, (b) Rice, (c) Texas A&M, (d) HMMT) 


: 2 
(c) { (77 pea = 27 a Hints: 157, 46, 177 (d) { mth Hints: 300, 1 
. 1 


5.68 Find a general formula for { SRE... SE where a and b are real numbers. 
(x — a)(x — b) 


5.69 Compute the length of the curve y = log(sin x) between the points (4, log ¥) and (3, 0). Hints: 292, 101

5.70 The hyperbolic cosine function is given by
\[ \cosh x = \frac{e^x + e^{-x}}{2}. \]
Find the length of the graph of $y = \cosh x$ from $x = 0$ to $x = 2$. Hints: 236


5.71 Recall that the error function is defined by 


\[ \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt. \]


Compute the following (your answer may be expressed in terms of erf(x)): 


(a) a x e-5 dt (b) Ms fy ePdt (©) £ ( yerfix)) (d) fi erf(x)dx Hints: 79 


5.72* Suppose that $f$ is a monotonic continuous function defined on a closed interval $[a,b]$. Prove that $\int_a^b f$ is
defined. (Hint: show that for any $\epsilon > 0$, we can find a partition $P$ such that $u(f,P) - l(f,P) < \epsilon$.) Hints: 83, 241

5.73* Find all continuous positive functions $f(x)$, for $0 \le x \le 1$, such that
\[ \int_0^1 f(x)\,dx = 1, \qquad \int_0^1 f(x)\,x\,dx = \alpha, \qquad \int_0^1 f(x)\,x^2\,dx = \alpha^2, \]
where $\alpha$ is a given real number. (Source: Putnam) Hints: 45, 130, 105

5.74* Find the volume of the intersection of two infinitely-long cylinders of radius 1, whose axes intersect at a
right angle. Hints: 73, 190, 273

5.75* Make a reasonable definition of a 4-dimensional unit sphere (of radius 1), and compute its volume.
Hints: 11, 144, 178


5.A FORMAL DEFINITIONS OF LOG AND EXP 


Throughout the book, we have been forced to have imprecise “definitions” of the exponential and natural 
logarithm functions. Now that we have defined the definite integral and proved the Fundamental Theorem of 
Calculus, we are finally able to make rigorous definitions of these functions. We start by defining the natural 
logarithm function: 


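Definition: For $x > 0$, the natural logarithm of $x$ is
\[ \log x = \int_1^x \frac{1}{t}\,dt. \]

Note that the Fundamental Theorem of Calculus tells us that $\frac{d}{dx}\log x = \frac{1}{x}$.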
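The integral definition can be tried out numerically. The sketch below is an illustration only: the name `log_by_integral` and the choice of a midpoint rule with 100,000 pieces are assumptions. It approximates $\int_1^x \frac{1}{t}\,dt$ and compares the result with Python's built-in `math.log`.

```python
import math

def log_by_integral(x, n=100_000):
    """Approximate the integral of 1/t from 1 to x (x > 0) with the midpoint rule."""
    h = (x - 1) / n          # negative when 0 < x < 1, so the integral comes out negative
    return h * sum(1 / (1 + (i + 0.5) * h) for i in range(n))

print(log_by_integral(2), math.log(2))
print(log_by_integral(6), log_by_integral(2) + log_by_integral(3))
```

The second line of output gives a numerical preview of the product property $\log ab = \log a + \log b$ that is proved next.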


We can prove all of the nice properties of the natural logarithm that we were forced to assume earlier in the
book. In particular:

Solution for Problem 5.76:

(a) We just apply the definition: $\log 1 = \int_1^1 \frac{1}{t}\,dt = 0$, since $\int_1^1 f = 0$ for any function $f$.

(b) Pretend that $b$ is constant, and we'll differentiate the function $\log ab$ with respect to $a$, using the Chain Rule:
\[ \frac{d}{da}\log ab = \frac{1}{ab}\cdot\frac{d}{da}(ab) = \frac{1}{ab}\cdot b = \frac{1}{a}. \]
Thus, we conclude that $\log a$ and $\log ab$, thought of as functions of $a$, have the same derivative, and thus must
differ by a constant. That is,
\[ \log ab = \log a + C \]
for some constant $C$. But plugging in $a = 1$ gives $\log b = \log 1 + C = C$, so $C = \log b$, and we have reached our
conclusion that
\[ \log ab = \log a + \log b. \]

(c) We simply rearrange part (b):
\[ \log a = \log\left(b\cdot\frac{a}{b}\right) = \log b + \log\frac{a}{b}, \]
and then subtracting $\log b$ from both sides gives the desired result.

(d) First, assume $n > 0$. Then the result is an easy consequence of part (b):
\[ \log a^n = \log(\underbrace{a\cdot a\cdots a}_{n\text{ times}}) = \underbrace{(\log a) + (\log a) + \cdots + (\log a)}_{n\text{ times}} = n\log a. \]
If $n = 0$, then the result is just $\log a^0 = \log 1 = 0 = 0\cdot\log a$. Finally, if $n < 0$, we use part (c):
\[ \log a^n = \log\left(\frac{1}{a}\right)^{-n} = -n\log\frac{1}{a} = -n(\log 1 - \log a) = n\log a. \]
□


We now would like to be able to take the inverse function of the logarithm and get the exponential function.
Since $\frac{1}{t} > 0$ for all $t > 0$, we know that $\log x = \int_1^x \frac{1}{t}\,dt$ is a strictly increasing function, so it has an inverse whose
domain is the range of log. By part (d) of Problem 5.76, we have $\log a^n = n\log a$ for any integer $n$ and positive $a$,
and the quantity $n\log a$ is clearly unbounded as $n$ ranges over all the integers (for any $a \ne 1$). Thus, since log is
continuous and has unbounded range, its range must be $\mathbb{R}$, and hence log must have an inverse with domain $\mathbb{R}$:


Using the properties of inverse functions, we can prove the properties corresponding to Problem 5.76 for the
exponential function:

Solution for Problem 5.77: All of these result from taking the exponential of the corresponding fact from
Problem 5.76.

(a) Note that
\[ \log e^{a+b} = a + b = \log e^a + \log e^b = \log e^a e^b. \]
Taking the exponential of both sides gives the desired result.

(b) We do essentially the same as part (a):
\[ \log e^{a-b} = a - b = \log e^a - \log e^b = \log\frac{e^a}{e^b}. \]
Taking the exponential of both sides gives the desired result.

(c) We have
\[ \log e^{an} = an = n\log e^a = \log (e^a)^n, \]
and again, taking the exponential of both sides gives the desired result.


We can also easily show that exp is its own derivative: 


Solution for Problem 5.78: We use the Inverse Function Rule for derivatives:
\[ (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))}, \]
where here $f(x) = \log x$, so that $f'(x) = \frac{1}{x}$, and $f^{-1}(x) = e^x$. This gives
\[ \frac{d}{dx}e^x = \frac{1}{\frac{1}{e^x}} = e^x, \]
as desired. □


We can further define exponential functions with any positive base:

Definition: For $a > 0$ and any real number $x$, we define $a^x = e^{x\log a}$.

Note that this definition even works for $a = e$, since $\log e = 1$. We quickly see that
\[ a^0 = e^{0\cdot\log a} = e^0 = 1 \]
and
\[ a^1 = e^{\log a} = a. \]

We can also show that $a^x$ has all the properties that we expect for exponentials:

Problem 5.79: Let $a$ be a positive real number and $b, c \in \mathbb{R}$. Show that:
(a) $a^{bc} = (a^b)^c$
(b) $a^{b+c} = a^b a^c$

Solution for Problem 5.79: We simply use the definition of $a^x$ and apply the properties that we know about the
function $e^x$.

(a) It’s easier if we start from the right side and write, by definition,
\[ (a^b)^c = e^{c\log a^b}. \]
But again we can apply the definition and get
\[ (a^b)^c = e^{c\log a^b} = e^{c\log e^{b\log a}}. \]
Using the fact that log and exp are inverses gives
\[ (a^b)^c = e^{c\log a^b} = e^{c\log e^{b\log a}} = e^{c(b\log a)}. \]
And now we just apply the definition one final time:
\[ (a^b)^c = e^{c\log a^b} = e^{c\log e^{b\log a}} = e^{c(b\log a)} = e^{bc\log a} = a^{bc}. \]

(b) This is easier than part (a):
\[ a^{b+c} = e^{(b+c)\log a} = e^{b\log a + c\log a} = \left(e^{b\log a}\right)\left(e^{c\log a}\right) = a^b a^c. \]
□


We could go on and define $\log_a x$ as the inverse of $a^x$, and prove its properties, and compute the derivatives of
$a^x$ and $\log_a x$, but you get the idea. The point is that the definitions of exponentials and logarithms can be made
rigorous and all of their relevant properties can be proved. So we weren't really cheating in Chapters 1-4 when
we used these properties.


5.B SIMPSON’S RULE

Simpson’s Rule is an approximation technique used to estimate $\int_a^b f$, in which we form a Riemann sum of
areas under parabolas. As with the other approximation methods that we studied in Section 5.5, we first choose
a positive integer $n$, and we partition $[a, b]$ into $n$ pieces of size $\frac{b-a}{n}$:
\[ a = x_0 < x_1 < \cdots < x_n = b, \]
where $x_i = a + i\left(\frac{b-a}{n}\right)$.

We will approximate the area under $f$ on each interval $[x_i, x_{i+1}]$ of the partition by constructing the parabola
that passes through the same points as the graph of $f$ at each end of the interval and at the midpoint of the interval.
That is, for each $0 \le i < n$, we wish to find a quadratic $q_i(x)$ such that
\[ q_i(x_i) = f(x_i), \qquad q_i\left(\frac{x_i + x_{i+1}}{2}\right) = f\left(\frac{x_i + x_{i+1}}{2}\right), \qquad q_i(x_{i+1}) = f(x_{i+1}). \]
Then, since quadratics are easy to integrate, we will approximate the integral by the sum of the integrals of the
quadratic pieces:
\[ \int_a^b f(x)\,dx \approx \sum_{i=0}^{n-1}\int_{x_i}^{x_{i+1}} q_i(x)\,dx. \tag{5.B.1} \]


Our next step is to investigate the integrals of these quadratics: 


Problem 5.80: Suppose that $q(x)$ is a quadratic function, and $c < d$ are real numbers. Let $m = \frac{c+d}{2}$ be the
midpoint of $[c,d]$. Compute $\int_c^d q(x)\,dx$ in terms of $c$, $d$, $q(c)$, $q(d)$, and $q(m)$.


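Solution for Problem 5.80: Since $q$ is quadratic, we can write it as $q(x) = a_2x^2 + a_1x + a_0$ for some coefficients $a_0, a_1, a_2$.
We can now compute:
\begin{align*}
\int_c^d (a_2x^2 + a_1x + a_0)\,dx &= \left(\tfrac13 a_2x^3 + \tfrac12 a_1x^2 + a_0x\right)\Big|_c^d \\
&= \tfrac13 a_2(d^3 - c^3) + \tfrac12 a_1(d^2 - c^2) + a_0(d - c) \\
&= \frac{d-c}{6}\left(2a_2(d^2 + dc + c^2) + 3a_1(d + c) + 6a_0\right). \tag{5.B.2}
\end{align*}

We want to write this expression in terms of $q(c)$, $q(d)$, and $q(m)$. We have
\begin{align*}
q(c) &= a_2c^2 + a_1c + a_0, \\
q(m) &= \tfrac14 a_2(d^2 + 2dc + c^2) + \tfrac12 a_1(d + c) + a_0, \\
q(d) &= a_2d^2 + a_1d + a_0.
\end{align*}
In particular, note that
\begin{align*}
q(c) + 4q(m) + q(d) &= a_2(c^2 + d^2 + 2dc + c^2 + d^2) + a_1(c + 2d + 2c + d) + a_0(1 + 4 + 1) \\
&= 2a_2(d^2 + dc + c^2) + 3a_1(d + c) + 6a_0.
\end{align*}

This exactly matches the expression in parentheses in (5.B.2), so we conclude that
\[ \int_c^d (a_2x^2 + a_1x + a_0)\,dx = \frac{d-c}{6}\left(2a_2(d^2 + dc + c^2) + 3a_1(d + c) + 6a_0\right) = \frac{d-c}{6}\left(q(c) + 4q(m) + q(d)\right). \]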
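The identity from Problem 5.80 can be spot-checked with exact rational arithmetic. The Python sketch below is an illustration only; the function names and the particular coefficients and endpoints are arbitrary choices, not from the text.

```python
from fractions import Fraction

def integral_of_quadratic(a2, a1, a0, c, d):
    """Exact integral of a2*x^2 + a1*x + a0 over [c, d], via the antiderivative."""
    def antiderivative(x):
        return a2 * x**3 / 3 + a1 * x**2 / 2 + a0 * x
    return antiderivative(d) - antiderivative(c)

def three_point_formula(a2, a1, a0, c, d):
    """(d - c)/6 * (q(c) + 4 q(m) + q(d)), where m is the midpoint of [c, d]."""
    def q(x):
        return a2 * x**2 + a1 * x + a0
    m = (c + d) / 2
    return (d - c) / 6 * (q(c) + 4 * q(m) + q(d))

# Arbitrary sample quadratic and interval, kept as exact fractions
coeffs = (Fraction(3), Fraction(-2), Fraction(7))
c, d = Fraction(-1, 2), Fraction(5, 3)

lhs = integral_of_quadratic(*coeffs, c, d)
rhs = three_point_formula(*coeffs, c, d)
print(lhs, rhs, lhs == rhs)
```

Because `Fraction` arithmetic is exact, the two sides agree exactly rather than merely to within rounding error.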


Using the result from Problem 5.80 in equation (5.B.1), where $c = x_i$, $d = x_{i+1}$, and $m = \frac{x_i + x_{i+1}}{2}$, we get
\begin{align*}
\int_a^b f(x)\,dx &\approx \sum_{i=0}^{n-1}\int_{x_i}^{x_{i+1}} q_i(x)\,dx \\
&= \sum_{i=0}^{n-1}\frac{x_{i+1} - x_i}{6}\left(f(x_i) + 4f\left(\frac{x_i + x_{i+1}}{2}\right) + f(x_{i+1})\right) \\
&= \frac{b-a}{6n}\sum_{i=0}^{n-1}\left(f(x_i) + 4f\left(\frac{x_i + x_{i+1}}{2}\right) + f(x_{i+1})\right). \tag{5.B.3}
\end{align*}

As discussed in Section 5.5, we usually apply this approximation by constructing a new partition with $m = 2n$
pieces
\[ a = x'_0 < x'_1 < x'_2 < \cdots < x'_m = b, \]
where the even terms are the terms of the original partition (so that $x'_{2i} = x_i$) and the odd terms are the midpoints of
the intervals in the original partition (so that $x'_{2i+1} = \frac{x_i + x_{i+1}}{2}$). Then the Simpson’s Rule approximation in (5.B.3)
becomes
\[ \frac{b-a}{3m}\left(f(x'_0) + 4f(x'_1) + 2f(x'_2) + 4f(x'_3) + 2f(x'_4) + \cdots + 4f(x'_{m-1}) + f(x'_m)\right). \]


INFINITY


Infinity is a concept that you probably have an intuitive feel for. Informally, some of the meanings of infinity are 
“larger than finite” and “grows without bound” and “arbitrarily large.” But what does “infinity” really mean? 
How do we rigorously work with infinity? 


In calculus, we have different meanings for infinity and for the infinity symbol $\infty$ depending on the context.
Sometimes $\infty$ represents a number or quantity that grows arbitrarily large (and similarly $-\infty$ represents a number
or quantity that grows arbitrarily negative). In other places, $\infty$ will be a placeholder for a quantity that is
unbounded: for example, the interval $[2, \infty)$ that has no upper bound.


In this chapter, we will explore different notions of $\infty$ as they occur in calculus.


6.1 LIMITS TOWARDS INFINITY 


Let’s recall our definition of limit: if $f$ is a real-valued function, then we
say that
\[ \lim_{x\to a} f(x) = L \]
if, for all $\epsilon > 0$, there exists $\delta > 0$ such that
\[ 0 < |x - a| < \delta \quad\Rightarrow\quad |f(x) - L| < \epsilon. \]

Also recall the graphical representation of this, shown at right. In words,
this says that we can get $f(x)$ to be as close to $L$ as we want by making
$x$ sufficiently close to $a$. Graphically, this means that given any pair of
horizontal lines, we can find a sufficiently narrow set of vertical lines so that
the graph of $f$ between the vertical lines lies entirely inside the horizontal
strip between the given horizontal lines.

[Figure: graphical representation of $\lim_{x\to a} f(x) = L$.]

How can we modify this concept to define
\[ \lim_{x\to\infty} f(x) = L? \]
We want to “make $x$ arbitrarily close to infinity,” but what does that mean? What makes sense is that “$x$ is
arbitrarily close to infinity” means “$x$ is arbitrarily large”: getting closer to infinity means getting larger and
larger. This motivates our definition:


Definition: Let $f$ be a real-valued function. We say
\[ \lim_{x\to\infty} f(x) = L \]
if, for all $\epsilon > 0$, there exists $N$ such that
\[ x > N \quad\Rightarrow\quad |f(x) - L| < \epsilon. \]

In words, this states that we can make $f(x)$ arbitrarily close to $L$ (meaning that we want $|f(x) - L| < \epsilon$ for some
given $\epsilon > 0$) by restricting $x$ to be “arbitrarily close” to $\infty$ (meaning that we have $x > N$ for some large $N$).

Concept: In the usual limit definition, we have the condition $0 < |x - a| < \delta$, signifying that
$x$ is sufficiently close to $a$. In the definition of a limit towards infinity, we have the
condition $x > N$, signifying that $x$ is sufficiently close to $\infty$.


We can also see what this looks like graphically. In the picture to the right,
we see that all values of $x > N$ make the function be within $\epsilon$ of our limit $L$.
If we make $\epsilon$ smaller, as in the second picture, we can move $N$ to the right to
still “squeeze” the function within $\epsilon$ of $L$.

[Figure: two graphs of $y = f(x)$, each showing the horizontal strip between $L - \epsilon$ and $L + \epsilon$ and the vertical line $x = N$.]

The definition of $\lim_{x\to-\infty} f(x)$ is similar, except now we have $x$ “arbitrarily
close” to $-\infty$, meaning that $x < N$ for some $N$:

Definition: Let $f$ be a real-valued function. We say
\[ \lim_{x\to-\infty} f(x) = L \]
if, for all $\epsilon > 0$, there exists $N$ such that
\[ x < N \quad\Rightarrow\quad |f(x) - L| < \epsilon. \]

As with limits at positive infinity, the condition $x < N$ in the definition
above is interpreted to mean that $x$ is arbitrarily close to $-\infty$. We will leave it
to you to draw the corresponding picture.


We also use the following terminology that describes the graph y = f(x): 


Definition: We say that the graph of a function f has a horizontal asymptote at L if

$$\lim_{x\to\infty} f(x) = L \quad\text{or}\quad \lim_{x\to-\infty} f(x) = L.$$

This means that the graph of f becomes arbitrarily close to the graph of y = L as x gets arbitrarily large (or arbitrarily negative).


Let’s practice using the definition of a limit at infinity. 


Problem 6.1: Find $\lim_{x\to\infty} \frac{1}{x}$.


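Solution for Problem 6.1: Let $f(x) = \frac{1}{x}$. It seems that the limit should be 0, as the value of f(x) gets closer and closer to 0 as x gets larger and larger. But how do we prove it?

As usual, in limit-definition computations, we start with a given $\epsilon > 0$, and we need to find a corresponding value of N so that the definition is satisfied. Specifically, we want N such that

$$x > N \;\Rightarrow\; |f(x) - 0| < \epsilon \;\Leftrightarrow\; \left|\frac{1}{x}\right| < \epsilon.$$

But $\left|\frac{1}{x}\right| < \epsilon \Leftrightarrow |x| > \frac{1}{\epsilon}$, so we can choose $N = \frac{1}{\epsilon}$. Then, reversing the argument, we note that if x > N, then $\frac{1}{x} < \frac{1}{N}$ and

$$|f(x) - 0| = \left|\frac{1}{x} - 0\right| = \left|\frac{1}{x}\right| < \left|\frac{1}{N}\right| = \epsilon.$$

Thus, x > N implies $|f(x) - 0| < \epsilon$, and hence, by definition, $\lim_{x\to\infty} f(x) = 0$. □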
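The choice $N = \frac{1}{\epsilon}$ can be spot-checked numerically. The following is only a sketch: the function names are hypothetical, and testing finitely many sample points illustrates the definition but does not prove the limit.

```python
def n_for_epsilon(eps):
    """Return N = 1/eps, the choice made in the solution to Problem 6.1."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return 1.0 / eps


def definition_holds(eps, samples):
    """Check |1/x - 0| < eps at every sampled x with x > N (a spot check, not a proof)."""
    n = n_for_epsilon(eps)
    return all(abs(1.0 / x) < eps for x in samples if x > n)


if __name__ == "__main__":
    for eps in (0.5, 0.01, 1e-6):
        n = n_for_epsilon(eps)
        xs = [n * k for k in (1.001, 2, 10, 1000)]
        print(eps, n, definition_holds(eps, xs))
```

Shrinking eps pushes N to the right, which matches the picture of squeezing the graph into a narrower strip.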


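One thing that should not surprise you is that limits at infinity satisfy all the same nice algebraic properties that other limits do:

$$\lim_{x\to\infty} (f+g)(x) = \lim_{x\to\infty} f(x) + \lim_{x\to\infty} g(x),$$

$$\lim_{x\to\infty} (fg)(x) = \left(\lim_{x\to\infty} f(x)\right)\left(\lim_{x\to\infty} g(x)\right),$$

$$\lim_{x\to\infty} (f/g)(x) = \frac{\lim_{x\to\infty} f(x)}{\lim_{x\to\infty} g(x)} \quad \left(\text{if } \lim_{x\to\infty} g(x) \ne 0\right),$$

$$\lim_{x\to\infty} (cf)(x) = c\left(\lim_{x\to\infty} f(x)\right),$$

provided all of the above limits exist. The same properties also hold for limits towards $-\infty$. We will leave the proofs of these properties as exercises.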


Next, we study the behavior approaching infinity of rational functions: functions that are quotients of 
polynomial functions. 


Solution for Problem 6.2: You may already intuitively believe that the answer should be $\frac{2}{3}$, because the $x^3$ terms “dominate” the others as x grows arbitrarily large. But how can we more rigorously see this?


Let's call the function f(x) and divide numerator and denominator by $x^3$:

$$f(x) = \frac{2 - \frac{4}{x^2} + \frac{1}{x^3}}{3 + \frac{2}{x} - \frac{1}{x^2} + \frac{3}{x^3}}.$$

Now it is clear what is happening: when $x \to \infty$, all of the terms with powers of x in the denominator will approach 0. So we are left with just $\frac{2}{3}$. Hence

$$\lim_{x\to\infty} \frac{2x^3 - 4x + 1}{3x^3 + 2x^2 - x + 3} = \frac{2}{3}.$$

We can see the horizontal asymptote at $\frac{2}{3}$ by drawing the graph of

$$y = \frac{2x^3 - 4x + 1}{3x^3 + 2x^2 - x + 3},$$

shown at right. As x grows large, we see that this graph approaches the line $y = \frac{2}{3}$; this is the graphical representation of

$$\lim_{x\to\infty} \frac{2x^3 - 4x + 1}{3x^3 + 2x^2 - x + 3} = \frac{2}{3}.$$
This sort of thing generalizes to any rational function: 


Problem 6.3: Describe how to find the limit as $x \to \infty$ of $h(x) = \frac{f(x)}{g(x)}$, where f(x) and g(x) are polynomials. How does this limit depend on the degrees of f and g?


Solution for Problem 6.3: For notational sake, let’s suppose f has degree m and g has degree n. So we write

$$h(x) = \frac{a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0}{b_n x^n + b_{n-1} x^{n-1} + \cdots + b_0},$$

where $a_m \ne 0$ and $b_n \ne 0$.

In keeping with our technique for Problem 6.2, we divide by the highest power of x present in either polynomial. This leads to 3 cases:


Case 1: m > n. If the polynomial in the numerator has higher degree than the polynomial in the denominator, we get, after division by $x^m$:

$$h(x) = \frac{a_m + \frac{a_{m-1}}{x} + \cdots + \frac{a_0}{x^m}}{\frac{b_n}{x^{m-n}} + \frac{b_{n-1}}{x^{m-n+1}} + \cdots + \frac{b_0}{x^m}}.$$

We see that as $x \to \infty$, the numerator approaches $a_m$, which is nonzero, but the denominator approaches 0. So the fraction as a whole becomes arbitrarily large (positive if $\frac{a_m}{b_n} > 0$, negative if $\frac{a_m}{b_n} < 0$), since the numerator approaches $a_m$ while the denominator becomes arbitrarily small. Therefore the limit does not exist. Although, in this situation, we sometimes write

$$\lim_{x\to\infty} h(x) = \infty \text{ or } -\infty,$$

which we will say more about in Section 6.2.
Case 2: m < n. We again divide by the highest power of x, which in this case is $x^n$:

$$h(x) = \frac{\frac{a_m}{x^{n-m}} + \frac{a_{m-1}}{x^{n-m+1}} + \cdots + \frac{a_0}{x^n}}{b_n + \frac{b_{n-1}}{x} + \cdots + \frac{b_0}{x^n}}.$$

Now the numerator approaches 0 while the denominator approaches $b_n$. So the limit is $0/b_n = 0$.
Case 3: m = n. This is the case in our earlier example from Problem 6.2. We divide by $x^n$:

$$h(x) = \frac{a_n + \frac{a_{n-1}}{x} + \cdots + \frac{a_0}{x^n}}{b_n + \frac{b_{n-1}}{x} + \cdots + \frac{b_0}{x^n}}.$$

As x approaches infinity, all the terms with powers of x in the denominator approach 0. So we have

$$\lim_{x\to\infty} h(x) = \frac{a_n}{b_n}.$$
To summarize: 


• If the numerator has higher degree, the rational function grows without bound.
• If the denominator has higher degree, the rational function has a horizontal asymptote at 0.
• If they have the same degree, the rational function has a horizontal asymptote equal to the ratio of the leading coefficients.
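The three cases can be turned into a small procedure. The sketch below is illustrative only: the function names are hypothetical, polynomials are assumed to be given as lists of integer coefficients with the constant term first, and the strings 'inf' and '-inf' stand in for growth without bound.

```python
from fractions import Fraction


def degree(coeffs):
    """Degree of a polynomial given as [a_0, a_1, ..., a_k] (lowest power first)."""
    for k in range(len(coeffs) - 1, -1, -1):
        if coeffs[k] != 0:
            return k
    raise ValueError("the zero polynomial has no degree")


def limit_at_infinity(num, den):
    """Limit of num(x)/den(x) as x -> infinity, following the three cases.

    Returns a Fraction when the limit is finite, or the string 'inf' / '-inf'
    when the quotient grows without bound.
    """
    m, n = degree(num), degree(den)
    lead = Fraction(num[m], den[n])  # ratio of leading coefficients a_m / b_n
    if m > n:
        return "inf" if lead > 0 else "-inf"
    if m < n:
        return Fraction(0)
    return lead


if __name__ == "__main__":
    # Equal degrees with leading coefficients 2 and 3: the limit is 2/3.
    print(limit_at_infinity([1, -4, 0, 2], [3, -1, 1, 3]))
```

Only the degrees and the leading coefficients are consulted, which is exactly the sense in which the highest-power terms dominate.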


Limits as x approaches $\infty$ are not all that mysterious, and can be restated in terms of a finite limit:

Problem 6.4: Suppose f is a function whose domain includes $(0, \infty)$. Show that

$$\lim_{x\to\infty} f(x) = \lim_{z\to 0^+} f\left(\frac{1}{z}\right),$$

provided either limit is defined.


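Solution for Problem 6.4: Intuitively, we see that x grows arbitrarily large as $\frac{1}{x} > 0$ gets arbitrarily close to 0, and vice versa, so the result seems plausible. The proof is a matter of plowing through the $\delta$-$\epsilon$ definitions.

First, suppose that $\lim_{x\to\infty} f(x) = L$, and let $\epsilon > 0$ be given. We know that we can choose N > 0 such that

$$x > N \;\Rightarrow\; |f(x) - L| < \epsilon.$$

Let $\delta = \frac{1}{N}$. Then

$$0 < z < \delta \;\Rightarrow\; \frac{1}{z} > N \;\Rightarrow\; \left|f\left(\frac{1}{z}\right) - L\right| < \epsilon.$$

But this last statement is exactly the definition of $\lim_{z\to 0^+} f\left(\frac{1}{z}\right) = L$.

Conversely, suppose that $\lim_{z\to 0^+} f\left(\frac{1}{z}\right) = L$, and let $\epsilon > 0$ be given. We know that we can choose $\delta > 0$ such that

$$0 < z < \delta \;\Rightarrow\; \left|f\left(\frac{1}{z}\right) - L\right| < \epsilon.$$

Let $N = \frac{1}{\delta}$. Then

$$x > N \;\Rightarrow\; 0 < \frac{1}{x} < \delta \;\Rightarrow\; \left|f\left(\frac{1}{1/x}\right) - L\right| < \epsilon \;\Rightarrow\; |f(x) - L| < \epsilon.$$

This is the definition of $\lim_{x\to\infty} f(x) = L$.

Therefore, the two limits are equal (provided they are defined). □

A statement similar to that of Problem 6.4 exists for limits as x approaches $-\infty$; we will leave this as an exercise.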


There is a special case of a function f in which we know that $\lim_{x\to\infty} f(x)$ must exist. If f is an increasing function and f(x) is bounded above, then f cannot increase without bound, but must approach some value as x approaches $\infty$. Not surprisingly, the value that f must approach is the least upper bound of f(x). We can make this statement more precise:

Problem 6.5: Suppose that f is a monotonically increasing function and f(x) is bounded above; that is, there exists some M such that $f(x) \le M$ for all x. Show that $\lim_{x\to\infty} f(x)$ exists and is equal to sup Rng(f), the least upper bound of the range of f.


Solution for Problem 6.5: Let L = sup Rng(f). We know that L must exist since Rng(f) is a subset of $\mathbb{R}$ that has an upper bound (namely M), so by the axioms of the real numbers (from Chapter 1), Rng(f) must have a least upper bound. We wish to show that $\lim_{x\to\infty} f(x) = L$. This means, by definition, that for any $\epsilon > 0$, we must find N > 0 such that

$$x > N \;\Rightarrow\; |f(x) - L| < \epsilon.$$

Because L is an upper bound, we have $f(x) - L \le 0$ for all $x \in \mathrm{Dom}(f)$, so the above statement is equivalent to

$$x > N \;\Rightarrow\; f(x) > L - \epsilon.$$

We simply choose any N such that $f(N) > L - \epsilon$. Such an N must exist, because if it doesn’t, then $L - \epsilon$ is an upper bound for f, contradicting the fact that L is the least upper bound for f. Then, because f is increasing, we have $f(x) \ge f(N) > L - \epsilon$ for any x > N, as desired. □


Essentially the same argument shows that if f is monotonically decreasing and is bounded below, then

$$\lim_{x\to\infty} f(x) = \inf \mathrm{Rng}(f).$$


EXERCISES 
6.1.1 Compute the following limits: 


_ 5xe— x27 +3x-2 , 2x4 + x? -3 _. x+sinx ., 
© Mimerae—2+6 =) MBs eeax-7 =) BR arsine inte? 
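6.1.2 Suppose f, g are functions with $\lim_{x\to\infty} f(x) = L$ and $\lim_{x\to\infty} g(x) = M$. Prove that

(a) $\lim_{x\to\infty} (f+g)(x) = L + M$  (b) $\lim_{x\to\infty} (cf)(x) = cL$ for any $c \in \mathbb{R}$

6.1.3 Show that $\lim_{x\to-\infty} f(x) = \lim_{z\to 0^-} f\left(\frac{1}{z}\right)$. Hints: 142

6.1.4★ If f(x) is differentiable and $\lim_{x\to\infty} f(x) = c$, then what can we say about $\lim_{x\to\infty} f'(x)$? Hints: 121, 43, 251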


6.2 LIMITS OF INFINITY 


In Section 6.1, we discussed what it means for there to be a limit of f(x) as x approaches $\infty$ or $-\infty$. Here, we discuss the related question of what it means for the value of the function to approach $\infty$; that is, the meaning of

$$\lim_{x\to a} f(x) = \infty,$$

for some $a \in \mathbb{R}$. Informally, this should mean that as x gets close to a, the value of f(x) approaches $\infty$, meaning the value of f(x) grows arbitrarily large. But of course we want to define this rigorously.

Definition: Let f be a real-valued function and $a \in \mathbb{R}$. We say

$$\lim_{x\to a} f(x) = \infty$$

if, for all N, there exists $\delta > 0$ such that

$$0 < |x - a| < \delta \;\Rightarrow\; f(x) > N.$$

In words, this means that we can make f(x) become arbitrarily close to $\infty$ (that is, we can require f(x) > N for any given N) by choosing x sufficiently close to a (that is, by choosing $\delta$ such that $0 < |x - a| < \delta$).


We can also represent this definition graphically, as in the picture to the right. Given any value of N, we can find a value of $\delta$ so that the entire graph between $x = a - \delta$ and $x = a + \delta$ lies above y = N. If we increase N, then we can always find a smaller $\delta$, as shown in the lower picture. Everything between the vertical lines at $x = a - \delta$ and $x = a + \delta$ lies above the horizontal line y = N.

[Figure: two graphs of y = f(x) near x = a, with vertical lines at $x = a - \delta$ and $x = a + \delta$ and a horizontal line at y = N.]

There’s a similar definition for

$$\lim_{x\to a} f(x) = -\infty,$$

but we will leave the statement of this definition as an exercise.

Let’s look at a quick example of the definition.


Solution for Problem 6.6: Let $f(x) = \frac{1}{(x-2)^2}$. Intuitively, we see that as x gets close to 2, the denominator is positive but very close to 0, so the value of f(x) gets very large. To show this rigorously, we have to show that, given any N, we can find a $\delta$ satisfying the definition.

So, given N, we need to find $\delta$ such that

$$0 < |x - 2| < \delta \;\Rightarrow\; \frac{1}{(x-2)^2} > N.$$

Note that if $N \le 0$, then any $\delta$ will work, since f(x) is positive-valued. Thus we can assume that N > 0. The condition $\frac{1}{(x-2)^2} > N$ is equivalent to $(x-2)^2 < \frac{1}{N}$, so $|x - 2| < \frac{1}{\sqrt{N}}$.

So we simply choose $\delta = \frac{1}{\sqrt{N}}$, and we’re done. □
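The choice $\delta = \frac{1}{\sqrt{N}}$ can be tried out at a few points on either side of x = 2. This is a minimal sketch with hypothetical function names; sampling a handful of points illustrates the definition and is not a substitute for the argument above.

```python
import math


def f(x):
    """The function 1/(x - 2)^2 from the solution."""
    return 1.0 / (x - 2.0) ** 2


def delta_for(n):
    """delta from the solution: any positive delta works for N <= 0, else 1/sqrt(N)."""
    if n <= 0:
        return 1.0
    return 1.0 / math.sqrt(n)


def spot_check(n, fractions_of_delta=(0.999, 0.5, 0.01)):
    """Check f(x) > N at sample points with 0 < |x - 2| < delta, on both sides of 2."""
    d = delta_for(n)
    points = [2.0 + s * t * d for t in fractions_of_delta for s in (1.0, -1.0)]
    return all(f(x) > n for x in points)


if __name__ == "__main__":
    for n in (-5, 1, 100, 1e6):
        print(n, delta_for(n), spot_check(n))
```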


We can also define one-sided limits that approach $\infty$ or $-\infty$, in the obvious way that is analogous to one-sided finite limits:

Definition: Let f be a real-valued function and $a \in \mathbb{R}$.
• $\lim_{x\to a^+} f(x) = \infty$ means that for all N, there exists $\delta > 0$ such that

$$0 < x - a < \delta \;\Rightarrow\; f(x) > N.$$

• $\lim_{x\to a^-} f(x) = \infty$ means that for all N, there exists $\delta > 0$ such that

$$0 < a - x < \delta \;\Rightarrow\; f(x) > N.$$


These one-sided limits are useful for a function such as $f(x) = \frac{1}{x}$, shown to the right. The graph of this function approaches the line x = 0, so we want to think of this function as having a vertical asymptote at x = 0. But $\lim_{x\to 0} \frac{1}{x}$ is undefined, since the function grows without bound as x approaches 0 from the right, but decreases without bound as x approaches 0 from the left. Instead, we can write

$$\lim_{x\to 0^+} \frac{1}{x} = \infty \quad\text{and}\quad \lim_{x\to 0^-} \frac{1}{x} = -\infty.$$

Again, note that $\lim_{x\to 0} \frac{1}{x}$ does not exist.


This example motivates our definition of vertical asymptote: 


Definition: We say that the graph of a function f has a vertical asymptote at x = a if

$$\lim_{x\to a^+} f(x) = \pm\infty \quad\text{or}\quad \lim_{x\to a^-} f(x) = \pm\infty.$$

(Note that $\pm\infty$ means that the limit is $\infty$ or $-\infty$.)


Geometrically, this means that the graph of f approaches the line x = a as x approaches a from one side. 


In Problem 6.3, we determined the horizontal asymptote (if any) of a rational function. Next, we examine the 
vertical asymptotes of such a function: 


Problem 6.7: What are the vertical asymptotes of $h(x) = \frac{f(x)}{g(x)}$, where f and g are polynomials?

Solution for Problem 6.7: Clearly, if a is such that $g(a) \ne 0$, then x = a cannot be a vertical asymptote, since then

$$\lim_{x\to a} \frac{f(x)}{g(x)} = \frac{f(a)}{g(a)} \in \mathbb{R}.$$

We might be tempted to “conclude” that:

Bogus Solution: If r is a root of g(x) (so that g(r) = 0), then there will be a vertical asymptote at x = r.

But this is not necessarily the case. For example, the function

$$h(x) = \frac{x^2 - 1}{x - 1}$$

does not have a vertical asymptote at x = 1, since

$$\lim_{x\to 1} h(x) = \lim_{x\to 1} (x + 1) = 2.$$

It is true that if r is a root of g (the denominator of the rational function) but not of f (the numerator of the function), then h = f/g will have a vertical asymptote at x = r. However, if r is a root of the numerator and denominator, then we divide both by the common factor (x − r) and recompute. □


EXERCISES 

6.2.1 

(a) Write a definition of lim f(x) = —00. 
(b) Write a definition of jim f(x) =. 
6.2.2 Compute the following limits: 


; 1 : x-1 
ie i ed (b) 


(a) 


Ss (a lim tanx 
xo1- 38 — x2 4+x-1 () x57 


6.2.3 Find a continuous function f such that jim f(x) is undefined (in particular, this limit is neither a for any 
a € Rnor +o). Hints: 293 


6.3 RATIONAL INDETERMINATE FORMS AND L’HÔPITAL’S RULE


In Problems 6.3 and 6.7, we determined the asymptotes of rational functions (functions that are quotients of polynomials). We’d like to be able to study more generally functions of the form $\frac{f}{g}$, where f and g are arbitrary functions, and in particular describe their behavior as x approaches $\infty$ or $-\infty$.

For example, you probably already believe that exponentials “dominate” polynomials. This means that

$$\lim_{x\to\infty} \frac{f(x)}{e^x} = 0$$

for any polynomial function f. However, why do we believe this? We have an idea that exponentials “grow faster” than polynomials, but how do we make this limit computation precise?

Let us first examine a related question that is slightly more tractable. Suppose f and g are differentiable functions such that f(a) = g(a) = 0 at some point a. How can we compute

$$\lim_{x\to a} \frac{f(x)}{g(x)}?$$

We immediately see the difficulty: although the function (f/g) is continuous where it is defined, it is not defined at x = a, since g(a) = 0. So we can’t just “plug in” x = a to compute the limit.

But the key here is that the functions are not just continuous: they are differentiable, and that gives us an additional tool to work with. Specifically, we have the tool of tangent line approximation. See if you can work it out from here:


Solution for Problem 6.8: The tangent line approximation of f at x = a is

$$f(x) \approx f(a) + f'(a)(x - a),$$

and similarly the tangent line approximation of g at x = a is

$$g(x) \approx g(a) + g'(a)(x - a).$$

This gives us the approximation

$$\frac{f(x)}{g(x)} \approx \frac{f(a) + f'(a)(x - a)}{g(a) + g'(a)(x - a)}.$$

But recall that f(a) = g(a) = 0. So we’re left with

$$\frac{f(x)}{g(x)} \approx \frac{f'(a)(x - a)}{g'(a)(x - a)}.$$

If $x \ne a$, we can cancel the x − a terms and we have

$$\lim_{x\to a} \frac{f(x)}{g(x)} \approx \lim_{x\to a} \frac{f'(a)}{g'(a)} = \frac{f'(a)}{g'(a)}.$$


The above argument is only informal and is not rigorous, because we approximated the limit with the tangent line approximations of f and g. Specifically, we’d like to get rid of that “≈” symbol and say

$$\lim_{x\to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}.$$

It turns out this is OK to do. Recall that in Problem 4.22, we defined

$$E_f(x) = f(x) - \left(f(a) + f'(a)(x - a)\right)$$

to be the error of the linear approximation of f, and we also showed that $\lim_{x\to a} \frac{E_f(x)}{x - a} = 0$. Thus, defining $E_g(x)$ similarly as the error of the linear approximation of g, we have

$$\lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f(a) + f'(a)(x - a) + E_f(x)}{g(a) + g'(a)(x - a) + E_g(x)} = \lim_{x\to a} \frac{f'(a) + \frac{E_f(x)}{x - a}}{g'(a) + \frac{E_g(x)}{x - a}} = \frac{f'(a)}{g'(a)}.$$


Putting this all together gives us:

Important: Basic version of l’Hôpital’s Rule: If f, g are differentiable functions with f(a) = g(a) = 0, and $g'(a) \ne 0$, then

$$\lim_{x\to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}.$$

In fact, l’Hôpital’s Rule works with g′(a) = 0 too! But then the quotient of the derivatives becomes a limit too. We have:

Important: More general version of l’Hôpital’s Rule: If f, g are differentiable functions with f(a) = g(a) = 0, then

$$\lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f'(x)}{g'(x)},$$

provided this last limit is defined.

In this more general version, if f′(a) = g′(a) = 0, then we can apply l’Hôpital’s Rule again if necessary. We will show the details of the proof in Section 6.A.


Here’s a basic example of l’Hôpital’s Rule in practice:

Problem 6.9: Compute $\lim_{x\to 0} \frac{\sin x}{x}$.

Solution for Problem 6.9: We’ve seen this limit before, but having l’Hôpital’s Rule in the toolbox makes it easy.

If we let f(x) = sin x and g(x) = x, then f(0) = g(0) = 0. L’Hôpital’s Rule says that we can evaluate $\frac{f'(0)}{g'(0)}$ to find the limit.

We see that f′(x) = cos x, so f′(0) = 1, and g′(x) = 1, so g′(0) = 1. Thus, the limit is $\frac{f'(0)}{g'(0)} = \frac{1}{1} = 1$. □

A function $\frac{f}{g}$ where f(a) = g(a) = 0 is called a $\frac{0}{0}$ indeterminate form at x = a. We can use l’Hôpital’s Rule to compute limits that are $\frac{0}{0}$ indeterminate forms. Let’s do one more example:


eo —3t-1 
Problem 6.10: Compute lim ET 


Solution for Problem 6.10: The numerator and denominator are 0 at t = 0, so this limit is a $\frac{0}{0}$ indeterminate form, and we can use l’Hôpital’s Rule by taking the derivative of both the numerator and the denominator:


ii et —3t-1 ate 3e* — 
fea 4t2 40 a 


We try to compute the limit by plugging in t = 0, but we see that this is still a $\frac{0}{0}$ indeterminate form at t = 0. No problem, we can use l’Hôpital’s Rule again:


_ ef —3t-1 |, 3e#-3 | ett 
he. Sea, Ce 


Now we plug in f = 0, and see that the limit is 3. 0 


A very nice feature of l’Hôpital’s Rule is that it is flexible, so that it can be used in many situations other than $\frac{0}{0}$ indeterminate forms. One such situation is to compute a limit of a quotient of functions towards $\infty$.

Problem 6.11: Suppose f, g are differentiable functions with

$$\lim_{x\to\infty} f(x) = \lim_{x\to\infty} g(x) = 0.$$

Use the substitution $z = \frac{1}{x}$ to show that l’Hôpital’s Rule applies “at infinity”; that is,

$$\lim_{x\to\infty} \frac{f(x)}{g(x)} = \lim_{x\to\infty} \frac{f'(x)}{g'(x)},$$

provided this last limit is defined.


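Solution for Problem 6.11: We use the result of Problem 6.4: if we substitute $z = \frac{1}{x}$, then

$$\lim_{x\to\infty} \frac{f(x)}{g(x)} = \lim_{z\to 0^+} \frac{f\left(\frac{1}{z}\right)}{g\left(\frac{1}{z}\right)}.$$

But $\lim_{z\to 0^+} f\left(\frac{1}{z}\right) = \lim_{x\to\infty} f(x) = 0$, and similarly $\lim_{z\to 0^+} g\left(\frac{1}{z}\right) = \lim_{x\to\infty} g(x) = 0$. So l’Hôpital’s Rule applies, and we have

$$\lim_{z\to 0^+} \frac{f\left(\frac{1}{z}\right)}{g\left(\frac{1}{z}\right)} = \lim_{z\to 0^+} \frac{\frac{d}{dz} f\left(\frac{1}{z}\right)}{\frac{d}{dz} g\left(\frac{1}{z}\right)} = \lim_{z\to 0^+} \frac{-\frac{1}{z^2} f'\left(\frac{1}{z}\right)}{-\frac{1}{z^2} g'\left(\frac{1}{z}\right)} = \lim_{z\to 0^+} \frac{f'\left(\frac{1}{z}\right)}{g'\left(\frac{1}{z}\right)}.$$

Putting this all together gives

$$\lim_{x\to\infty} \frac{f(x)}{g(x)} = \lim_{z\to 0^+} \frac{f\left(\frac{1}{z}\right)}{g\left(\frac{1}{z}\right)} = \lim_{z\to 0^+} \frac{f'\left(\frac{1}{z}\right)}{g'\left(\frac{1}{z}\right)} = \lim_{x\to\infty} \frac{f'(x)}{g'(x)},$$

as desired. □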


Of course, essentially the same argument works for limits as x approaches $-\infty$; we will leave the details as an exercise.

Yet another flavor of l’Hôpital’s Rule deals with infinite limits:

Important: L’Hôpital’s Rule for infinite limits: if f, g are differentiable functions with

$$\lim_{x\to a} f(x) = \lim_{x\to a} g(x) = \pm\infty,$$

then

$$\lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f'(x)}{g'(x)},$$

provided this last limit is defined. This also works as x approaches $\pm\infty$: if

$$\lim_{x\to\pm\infty} f(x) = \lim_{x\to\pm\infty} g(x) = \pm\infty,$$

then

$$\lim_{x\to\pm\infty} \frac{f(x)}{g(x)} = \lim_{x\to\pm\infty} \frac{f'(x)}{g'(x)}.$$

These statements are considerably harder to prove, so we will leave the proofs to Section 6.A. Not surprisingly, a function $\frac{f}{g}$ where $\lim_{x\to a} f(x) = \lim_{x\to a} g(x) = \infty$ is called an $\frac{\infty}{\infty}$ indeterminate form at x = a. We can use l’Hôpital’s Rule to compute limits of $\frac{\infty}{\infty}$ indeterminate forms, and in particular we can use it to address the question posed at the beginning of this section:


Problem 6.12: Suppose f is a polynomial function. Compute $\lim_{x\to\infty} \frac{f(x)}{e^x}$.

Solution for Problem 6.12: We know that $\lim_{x\to\infty} f(x) = \pm\infty$ for any non-constant polynomial f, and also $\lim_{x\to\infty} e^x = \infty$. Thus $\frac{f(x)}{e^x}$ is an $\frac{\infty}{\infty}$ indeterminate form as x approaches $\infty$, and we can apply l’Hôpital’s Rule, noting that $\frac{d}{dx} e^x = e^x$:

$$\lim_{x\to\infty} \frac{f(x)}{e^x} = \lim_{x\to\infty} \frac{f'(x)}{e^x}.$$

This helps because the degree of f′ is one less than the degree of f. If f′ is non-constant, then this is still an $\frac{\infty}{\infty}$ indeterminate form, and we can continue to apply l’Hôpital’s Rule. The denominator will stay $e^x$, but the degree of the numerator will decrease by 1 again. Repeating this process will eventually lead to a polynomial with degree 0 in the numerator, which is a constant. Specifically, if deg f = n, then

$$\lim_{x\to\infty} \frac{f(x)}{e^x} = \lim_{x\to\infty} \frac{f'(x)}{e^x} = \cdots = \lim_{x\to\infty} \frac{f^{(n)}(x)}{e^x} = \lim_{x\to\infty} \frac{c}{e^x},$$

for some constant c. But this limit is 0, since the denominator grows without bound while the numerator is constant. Thus

$$\lim_{x\to\infty} \frac{f(x)}{e^x} = 0.$$


Problem 6.12 is a special case of a more general property that we examine for a quotient of functions: 


Definition: We say that a function g dominates a function f if

$$\lim_{x\to\infty} \frac{f(x)}{g(x)} = 0.$$

Informally, this means that g(x) “grows faster” than f(x) as x grows arbitrarily large. In our previous examples, we have seen that higher degree polynomials dominate lower degree polynomials, and that exponential functions dominate all polynomials. As an exercise, you will show that all polynomials dominate the sine and cosine functions.


We often have to be a little careful using l’H6pital’s Rule, as in the following example: 


Solution for Problem 6.13: This function is an $\frac{\infty}{\infty}$ indeterminate form. So we use l’Hôpital’s Rule to compute the limit. We take the derivatives of numerator and denominator:

$$\lim_{x\to\infty} \frac{x + \sin x}{2x + 3} = \lim_{x\to\infty} \frac{1 + \cos x}{2}.$$

However, this latter limit is undefined, since cosine oscillates and does not approach any value. This might lead to the wrong conclusion that the original limit is undefined as well.

No! This just means that we can’t apply l’Hôpital’s Rule to compute the original limit. Then, how do we calculate the original limit? We do it in the way that we first computed limits at $\infty$ for rational functions, by dividing by the highest power of x:

$$\lim_{x\to\infty} \frac{x + \sin x}{2x + 3} = \lim_{x\to\infty} \frac{1 + \frac{\sin x}{x}}{2 + \frac{3}{x}}.$$

Now the answer is clear: the two terms with x in the denominator approach 0 as x approaches $\infty$, and thus the limit is $\frac{1}{2}$. □
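A quick numerical look makes the contrast concrete. In this sketch (function names are hypothetical), the original quotient settles near one half for large x, while the quotient of derivatives keeps swinging between 0 and 1, so it has no limit to hand back to l’Hôpital’s Rule.

```python
import math


def quotient(x):
    """The original quotient (x + sin x) / (2x + 3)."""
    return (x + math.sin(x)) / (2.0 * x + 3.0)


def derivative_quotient(x):
    """The quotient of derivatives (1 + cos x) / 2."""
    return (1.0 + math.cos(x)) / 2.0


if __name__ == "__main__":
    for x in (1e2, 1e4, 1e8):
        print(x, quotient(x))
    # Even multiples of pi give values near 1, odd multiples give values near 0.
    for k in (1000, 1001, 1002, 1003):
        print(k, derivative_quotient(k * math.pi))
```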


EXERCISES 
6.3.1 Compute the following limits: 


.. 1-cosx .. sin3x J. . log x 
A waco OF) chtbatiin tO ener, aay he SR Vena 


6.3.2 Show that if f is a non-constant polynomial, then f dominates the sine and cosine functions. Hints: 176 


6.3.3 Show that any non-constant polynomial dominates the function f(x) = x. Hints: 196 


6.3.4 If we apply l’Hôpital’s Rule to

$$\lim_{h\to 0} \frac{f(x+h) - f(x)}{h},$$

what happens?


6.3.5 Show that l’H6pital’s Rule works for , indeterminate forms as x — —oo. 


6.4 EXPONENTIAL INDETERMINATE FORMS


As we saw in the last section, l’Hôpital’s Rule is a very useful tool for computing limits of quotients of functions that produce indeterminate forms. But l’Hôpital’s Rule has even more powerful uses than in the examples we have already seen.

Our motivation is the following classic limit:

Solution for Problem 6.14: At first glance, this doesn’t really look like a limit for which we can use l’Hôpital’s Rule. But taking the logarithm is the key to converting it into a more traditional indeterminate form.


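(a) We let $y = \left(1 + \frac{1}{x}\right)^x$. Then

$$\log y = x \log\left(1 + \frac{1}{x}\right).$$

But this can be rewritten as a quotient of functions:

$$\log y = \frac{\log\left(1 + \frac{1}{x}\right)}{\frac{1}{x}}.$$

Now we see that as $x \to \infty$, both the numerator and the denominator go to 0, so we have a $\frac{0}{0}$ indeterminate form.

(b) We can apply l’Hôpital’s Rule:

$$\lim_{x\to\infty} \log y = \lim_{x\to\infty} \frac{\frac{1}{1 + \frac{1}{x}} \cdot \left(-\frac{1}{x^2}\right)}{-\frac{1}{x^2}}.$$

This looks ugly, but the $-\frac{1}{x^2}$ terms cancel, and we have

$$\lim_{x\to\infty} \log y = \lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^{-1} = 1.$$

(c) We can undo the logarithm by taking the exponential of both sides of the above equation, and this preserves the limit because the exponential function is continuous. Thus, we have

$$\lim_{x\to\infty} y = e^1 = e,$$

and hence

$$\lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^x = e.$$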

Problem 6.14 is an example that shows that l’Hôpital’s Rule is useful for more than just $\frac{0}{0}$ and $\frac{\infty}{\infty}$ indeterminate forms. Let’s look at a more complicated example:

Problem 6.15: Compute $\lim_{x\to 0^+} x^{\sin x}$.

Solution for Problem 6.15: We write $f(x) = x^{\sin x}$. Taking the logarithm of both sides, we get

$$\log f(x) = \log\left(x^{\sin x}\right) = (\sin x)(\log x).$$

We’d like this in a form to which we can apply l’Hôpital’s Rule, so write this as

$$\log f(x) = \frac{\log x}{\csc x}.$$

Now it is an $\frac{\infty}{\infty}$ indeterminate form as $x \to 0^+$, so we apply l’Hôpital’s Rule:

$$\lim_{x\to 0^+} \log f(x) = \lim_{x\to 0^+} \frac{\log x}{\csc x} = \lim_{x\to 0^+} \frac{\frac{1}{x}}{-\csc x \cot x} = \lim_{x\to 0^+} \frac{-\sin^2 x}{x \cos x}.$$

This is now a $\frac{0}{0}$ indeterminate form, so we apply l’Hôpital’s Rule again:

$$\lim_{x\to 0^+} \log f(x) = \lim_{x\to 0^+} \frac{-2 \sin x \cos x}{\cos x - x \sin x}.$$

At x = 0, the numerator of the last expression is 0 and the denominator is 1, so the limit is 0. Thus,

$$\lim_{x\to 0^+} \log f(x) = 0,$$

and hence $\lim_{x\to 0^+} f(x) = e^0 = 1$. □
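Both exponential limits in this section can be sanity-checked with a few floating-point evaluations. The sketch below uses hypothetical function names and only suggests the limiting values; it does not replace the logarithm argument.

```python
import math


def compound(x):
    """(1 + 1/x)**x, the expression from Problem 6.14."""
    return (1.0 + 1.0 / x) ** x


def x_to_sin_x(x):
    """x**sin(x) for x > 0, computed as exp(sin(x) * log(x)) as in Problem 6.15."""
    return math.exp(math.sin(x) * math.log(x))


if __name__ == "__main__":
    for x in (1e1, 1e3, 1e6):
        print(x, compound(x), math.e)
    for x in (1e-1, 1e-3, 1e-6):
        print(x, x_to_sin_x(x))
```

Writing the power as an exponential of a product mirrors the step of taking logarithms in the solutions.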


EXERCISES 
6.4.1 Compute the following limits: 


acai a 
(a) Jim xlog x (b) lim xsin(—) (c) lim x sin(l-x)/ Hints: 61, 50 (Source: HMMT) 
6.4.2 Show that $e = \lim_{x\to 0^+} (1+x)^{1/x}$.


6.4.3 Compute lim (1+ kx)?, where k > 0 is a positive constant. Hints: 125 


6.4.4 Suppose f and g are functions such that lim f(x) = jim g(x) = co. Explain how we can use l’H6pital’s Rule 
as a tool to compute jim (f(x) — 9(x)). Hints: 161, 156, 162 


6.5 IMPROPER INTEGRALS 


As we’ve seen, we use the definite integral $\int_a^b f$ to compute the area of the region under the graph of y = f(x) along the interval [a, b]. By definition, these integrals can only be used to compute areas of bounded regions. In some situations, however, we are interested in unbounded regions: these are regions that extend “towards infinity” in at least one direction. Yet, many unbounded regions still have finite area.


We start with a basic example of this phenomenon: 


Solution for Problem 6.16: We sketch a picture of this region at right. Notice that this region is unbounded: the region extends towards $+\infty$ as x grows large.

Even though this region is unbounded, we can attempt to determine its area. We certainly can compute the area of the portion of the region to the left of x = b (for any b > 1) as the definite integral

$$\int_1^b \frac{1}{x^2}\,dx.$$

As b grows larger, we expect that the area under the curve on [1, b] approaches the area of the entire region under the curve on $[1, +\infty)$. Specifically, this area is

$$\lim_{b\to\infty} \int_1^b \frac{1}{x^2}\,dx.$$

The integral is easy to evaluate:

$$\int_1^b \frac{1}{x^2}\,dx = \left(-\frac{1}{x}\right)\Big|_1^b = 1 - \frac{1}{b}.$$

Thus, when we take the limit, we get that the area of the region is

$$\lim_{b\to\infty} \int_1^b \frac{1}{x^2}\,dx = \lim_{b\to\infty} \left(1 - \frac{1}{b}\right) = 1 - 0 = 1.$$

□

Note the “paradox” here: even though the region is unbounded, it has finite area. Problem 6.16 suggests a logical definition:

Definition: Let f be a continuous function and $a \in \mathbb{R}$ such that $(a, \infty) \subseteq \mathrm{Dom}(f)$. We define the improper integral

$$\int_a^\infty f(x)\,dx = \lim_{b\to\infty} \int_a^b f(x)\,dx,$$

provided the limit is defined.

There’s an obviously similar definition for improper integrals in the other direction:

Definition: Let f be a continuous function and $b \in \mathbb{R}$ such that $(-\infty, b) \subseteq \mathrm{Dom}(f)$. We define the improper integral

$$\int_{-\infty}^b f(x)\,dx = \lim_{a\to-\infty} \int_a^b f(x)\,dx,$$

provided the limit is defined. If the limit is defined and is not $\pm\infty$, we say that the improper integral converges. Otherwise, we say that the improper integral diverges.


Let’s generalize Problem 6.16: 


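Problem 6.17: Let r be a real number. Compute

$$\int_1^\infty \frac{1}{x^r}\,dx.$$

Solution for Problem 6.17: By definition, we compute the improper integral by writing a limit. If $r \ne 1$, then we have:

$$\lim_{b\to\infty} \int_1^b \frac{1}{x^r}\,dx = \lim_{b\to\infty} \left(-\frac{1}{(r-1)x^{r-1}}\right)\Big|_1^b.$$

This equals

$$\lim_{b\to\infty} \left(\frac{1}{r-1} - \frac{1}{(r-1)b^{r-1}}\right).$$

If r > 1, then the term $\frac{1}{b^{r-1}}$ approaches 0 as b approaches $\infty$. Thus, in this case, the improper integral converges to $\frac{1}{r-1}$.

If r < 1, then the term $\frac{1}{b^{r-1}}$ grows without bound as b approaches $\infty$. Thus, the integral diverges. We might also write

$$\int_1^\infty \frac{1}{x^r}\,dx = \infty \quad\text{if } r < 1.$$

Our original integration was not valid for r = 1, so we have to do that case separately:

$$\lim_{b\to\infty} \int_1^b \frac{1}{x}\,dx = \lim_{b\to\infty} (\log x)\Big|_1^b = \lim_{b\to\infty} (\log b).$$

As b goes towards infinity, this grows without bound, so the integral diverges.

In summary:

$$\int_1^\infty \frac{1}{x^r}\,dx \;\begin{cases} = \frac{1}{r-1} & \text{if } r > 1, \\ \text{diverges} & \text{if } r \le 1. \end{cases}$$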
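The dependence on r shows up clearly if we tabulate the truncated integral for growing b. The sketch below evaluates the antiderivative directly (no numerical integration); the function name is hypothetical.

```python
import math


def truncated_integral(r, b):
    """Exact value of the integral of x**(-r) over [1, b], for b >= 1."""
    if r == 1:
        return math.log(b)
    # Antiderivative -1 / ((r - 1) x^(r - 1)) evaluated from 1 to b.
    return (1.0 - b ** (1.0 - r)) / (r - 1.0)


if __name__ == "__main__":
    for r in (0.5, 1, 2, 3):
        print(r, [truncated_integral(r, b) for b in (1e2, 1e4, 1e6)])
```

For r = 2 and r = 3 the values level off near 1 and 1/2, while for r = 1 and r = 0.5 they keep growing as b increases.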


The next problem is another common example of an improper integral: 


Problem 6.18: Compute $\int_0^\infty e^{ax}\,dx$, where a is a real number.

Solution for Problem 6.18: We compute, for $a \ne 0$ (we’ll investigate a = 0 at the end):

$$\int_0^\infty e^{ax}\,dx = \lim_{b\to\infty} \int_0^b e^{ax}\,dx = \lim_{b\to\infty} \frac{1}{a} e^{ax}\Big|_0^b = -\frac{1}{a} + \frac{1}{a} \lim_{b\to\infty} e^{ab}.$$

If a is positive, then $\lim_{b\to\infty} e^{ab} = \infty$, so the integral diverges. If a is negative, then $\lim_{b\to\infty} e^{ab} = 0$, so the integral equals $-\frac{1}{a}$. (Note this is a positive number when a is negative, so this answer makes sense.) Finally, if a = 0, then the integral is $\int_0^\infty 1\,dx$, which clearly diverges.

Thus, the integral diverges for nonnegative exponents, and converges for negative exponents. □

The result of Problem 6.18 is typically written as follows: if r > 0, then

$$\int_0^\infty e^{-rx}\,dx = \frac{1}{r}.$$


Problem 6.19: Suppose f and g are continuous functions on $[a, \infty)$ and $f(x) \le g(x)$ for all $x \ge a$.
(a) Show that, if $\int_a^\infty f$ and $\int_a^\infty g$ both converge, then

$$\int_a^\infty f \le \int_a^\infty g.$$

(b) Show that if both functions are positive, and $\int_a^\infty g$ converges, then $\int_a^\infty f$ converges.
(c) Show that if both functions are positive, and $\int_a^\infty f$ diverges, then $\int_a^\infty g$ diverges.

Solution for Problem 6.19:
(a) For any b > a, we have $(g - f)(x) \ge 0$ for all $x \in [a, b]$, thus

$$\int_a^b (g(x) - f(x))\,dx \ge 0.$$

Therefore,

$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx,$$

and since limits preserve non-strict inequalities, we conclude that

$$\int_a^\infty f(x)\,dx = \lim_{b\to\infty} \int_a^b f(x)\,dx \le \lim_{b\to\infty} \int_a^b g(x)\,dx = \int_a^\infty g(x)\,dx.$$

(b) Define a function

$$F(x) = \int_a^x f(t)\,dt.$$

Note that F is an increasing function (since f(x) > 0 for all $x \ge a$), and that $\int_a^\infty f(x)\,dx = \lim_{x\to\infty} F(x)$, if this limit exists. Also, since $0 < f(x) \le g(x)$ for all $x \ge a$, we have

$$0 \le F(x) = \int_a^x f(t)\,dt \le \int_a^x g(t)\,dt \le \int_a^\infty g(t)\,dt.$$

Thus F is increasing and has an upper bound (namely, $\int_a^\infty g(t)\,dt$, which by assumption converges), so by the result of Problem 6.5, the limit

$$\lim_{x\to\infty} F(x) = \int_a^\infty f(t)\,dt$$

exists, so the integral converges.

(c) This is just the contrapositive statement to part (b), so there is nothing additional to prove. □


Thus far in this section, we have looked at improper integrals that compute areas of regions that are unbounded 
in the x-direction. There is another type of improper integral that occurs when the region that we are examining 
is unbounded in the y-direction, as in the following example: 


Solution for Problem 6.20: Sketching the graph will immediately show the issue. We have $\lim_{x\to 0^+} \frac{1}{\sqrt{x}} = \infty$, so the area under $y = \frac{1}{\sqrt{x}}$ is potentially infinite (and in fact the function is not even defined at 0).

We can do essentially the same thing we did for improper integrals with a limit of integration of $\pm\infty$. We define

$$\int_0^1 \frac{1}{\sqrt{x}}\,dx = \lim_{c\to 0^+} \int_c^1 \frac{1}{\sqrt{x}}\,dx.$$

Note the “$0^+$”: since we only care about the interval (0, 1], we only care about what happens to the right of 0.

This integral is now easy to compute:

$$\int_c^1 \frac{1}{\sqrt{x}}\,dx = 2\sqrt{x}\,\Big|_c^1 = 2 - 2\sqrt{c}.$$

As $c \to 0^+$, this approaches 2. Hence

$$\int_0^1 \frac{1}{\sqrt{x}}\,dx = 2.$$

Once again, a seemingly infinite area turns out to be finite. □
We can generalize the definition from Problem 6.20: 


Of course, we can do the same thing if the function has a limit of $\pm\infty$ at the “b” end of [a, b). (We will omit writing out the formal definition.)

Sidenote: Note that the above definition is consistent with our usual (non-improper) integrals. In particular, if $\int_a^b f$ is defined, then by the Fundamental Theorem of Calculus, the function

$$g(x) = \int_x^b f(t)\,dt$$

is differentiable, hence continuous, and thus

$$\int_a^b f(t)\,dt = g(a) = \lim_{x\to a^+} g(x) = \lim_{x\to a^+} \int_x^b f(t)\,dt.$$


We know that for regular (not improper) integrals, we can break them apart at any point into two separate integrals:

$$\int_a^b f = \int_a^c f + \int_c^b f.$$


This is also how we evaluate integrals that are improper at both ends, as in the following example: 


Problem 6.21: Compute $\int_0^\infty \frac{1}{x^r}\,dx$ for all r > 0 (or determine when it diverges).

Solution for Problem 6.21: The correct thing to do with an integral that is improper at both ends is to split it somewhere in the middle. For example, we can write

$$\int_0^\infty \frac{1}{x^r}\,dx = \int_0^1 \frac{1}{x^r}\,dx + \int_1^\infty \frac{1}{x^r}\,dx.$$

(We didn’t have to pick x = 1 as the point at which to split them, but it seems convenient since $x^r$ is nicely behaved at x = 1.) We already know by Problem 6.17 that $\int_1^\infty \frac{1}{x^r}\,dx$ converges if and only if r > 1. The other integral is

$$\int_0^1 \frac{1}{x^r}\,dx = \lim_{a\to 0^+} \int_a^1 \frac{1}{x^r}\,dx = \lim_{a\to 0^+} \left(-\frac{1}{(r-1)x^{r-1}}\right)\Big|_a^1 = \frac{1}{r-1} \lim_{a\to 0^+} \left(\frac{1}{a^{r-1}} - 1\right).$$

If r > 1, then the fraction gets arbitrarily large, so the limit is infinite. Thus $\int_0^1 \frac{1}{x^r}\,dx$ diverges for r > 1.

Hence our original doubly-improper integral is never convergent: the integral on (0, 1] diverges for r > 1, and the integral on $[1, \infty)$ diverges for $r \le 1$. □


Important: __ If (a,b) C Dom(f) and fi f(t) dt is improper at both ends of (a,b), then 
a 


, f()dt = lim if f(t) dt + lim i f(t) dt, 


for any c € (a,b). 


As noted in the solution to Problem 6.21, it doesn’t matter at which point we break up the doubly-improper 
integral. 
Concept: We can break an integral apart as

$$\int_a^b f = \int_a^c f + \int_c^b f$$

at any c ∈ (a,b) that we choose. Thus, choose c to be as convenient as possible.


We will leave it as an exercise to prove this. Also, it is not correct to try to take a shortcut and deal with both
ends of a doubly-improper integral at once. In particular:

$$\int_{-\infty}^{\infty} f(x)\,dx \quad\text{is not the same as}\quad \lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx.$$

The correct way to evaluate an integral over all of ℝ is to choose c ∈ ℝ, and then compute

$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{c} f(x)\,dx + \int_{c}^{\infty} f(x)\,dx = \lim_{a \to -\infty} \int_a^c f(x)\,dx + \lim_{b \to \infty} \int_c^b f(x)\,dx.$$


We will leave it as an exercise to explore this further. 


We also have to be a bit cautious when dealing with functions with domains that are not all of R. Integrals of 
such functions might be improper but not immediately appear so. For example: 


Problem 6.22: Compute $\int_{-2}^{3} \frac{1}{x^2}\,dx$.


Solution for Problem 6.22: If you weren’t paying close attention, you might do this: 


Bogus Solution: 


We can’t do this, because the function is not defined at 0! To be a little more precise, the integrand does not have
an antiderivative on the interval [−2, 3], because it is not defined at x = 0, so we cannot apply the Fundamental
Theorem of Calculus.


In order to evaluate the integral, we need to break it up into a sum of two improper integrals at the point at which the function is undefined:

$$\int_{-2}^{3} \frac{1}{x^2}\,dx = \int_{-2}^{0} \frac{1}{x^2}\,dx + \int_{0}^{3} \frac{1}{x^2}\,dx.$$


As we saw in Problem 6.21, both of these diverge. Thus, the original integral itself diverges. □
More generally, when computing something like $\int_{-1}^{1} \frac{dx}{x}$, it might be tempting to say “$\frac{1}{x}$ is an odd function, so the integral from −1 to 0 will cancel out the integral from 0 to 1, and thus the overall sum is 0.” This is also the result that naive calculation will give:


Bogus Solution: 


$$\int_{-1}^{1} \frac{dx}{x} = \log|x| \,\Big|_{-1}^{1} = \log|1| - \log|-1| = 0.$$
But this is not correct! The only legitimate way to evaluate this integral is to break it up into its improper parts:

$$\int_{-1}^{1} \frac{dx}{x} = \int_{-1}^{0} \frac{dx}{x} + \int_{0}^{1} \frac{dx}{x}.$$

Neither part converges, so the integral diverges.


EXERCISES 
6.5.1 Compute the following improper integrals: 


1 1 - ane ' 
a) ih (2x = 1) ax (b) { x(log x)? xp dx (c) i: xe* dx (d) } 7_. dx 
2 


1 
(a) Compute it inn 


1 
(b) Compute ETT he 
6.5.3 Compute xe dx. 
0 


6.5.4 Show that it doesn’t matter at which point we break up a doubly-improper integral. Specifically, show that,
for any c,d ∈ (a,b), if $\int_a^c f$ and $\int_c^b f$ converge, then $\int_a^d f$ and $\int_d^b f$ also converge, and

$$\int_a^c f + \int_c^b f = \int_a^d f + \int_d^b f.$$


Hints: 230, 97 
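6.5.5★

(a) Show that if $\int_{-\infty}^{\infty} f(x)\,dx$ converges, then $\int_{-\infty}^{\infty} f(x)\,dx = \lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx$. Hints: 165

(b) Show that the converse of part (a) is not true; that is, it is possible that $\lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx$ converges but that $\int_{-\infty}^{\infty} f(x)\,dx$ diverges. Hints: 82, 225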


REVIEW PROBLEMS
6.23 Compute the following: 


_ 2-1 .. cos?x—1 ; ie 
ote SS -—.. ~ Se eee 
6.24 Suppose a and b are nonzero real numbers. Find lim — bi as and lim — Hints: 32 
in bt +0 tan bt’ 


6.25 Compute 


(~ oot le 1 
—2x ; . 
(a) e ~ dx (b) { 3 dx (c) a Ste sie dx Hints: 72 
1
dx 
6.26 Compute ———.. (Source: HMMT) Hints: 29, 265, 44 
pute ve+ ae 


6.27 Compute lim ( Vx2 +x - x). (Source: [Sp]) Hints: 135 


6.28 Sometimes, “little-o” notation is used to describe the growth rates of functions. Specifically, if f is a function,
then o(f) is the set of functions defined as:

$$o(f) = \left\{ g \;\middle|\; \lim_{x \to \infty} \frac{g(x)}{f(x)} = 0 \right\}.$$

Prove the following:

(a) o(f)o(g) ⊆ o(fg) (This means that if $h_1 \in o(f)$ and $h_2 \in o(g)$, then $h_1 h_2 \in o(fg)$.)
(b) o(o(f)) ⊆ o(f) (This means that if $h_1 \in o(f)$ and $h_2 \in o(h_1)$, then $h_2 \in o(f)$.)
(c) If f is a polynomial and $g(x) = e^x$, then f ∈ o(g).


CHALLENGE PROBLEMS 


6.29 The gamma function is defined for a positive real number z as

$$\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt.$$

(a) Compute Γ(1).

(b) Show that Γ(z + 1) = zΓ(z) for any z > 0. Hints: 195
(c) Use (a) and (b) to find a simple formula for Γ(n), where n is a positive integer. Hints: 212


6.30 Suppose a,b,x,y are all positive and a + b = 1. Compute lim (ax' + by')!. 


6.31 Recall from page 140 the error function, defined by

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt.$$

It has that weird constant in the front because

$$\int_0^\infty e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}.$$


(We'll prove this later in the book.) Using this fact, compute 


- ott dx, 


where a,b are real numbers with b ≠ 0.


6.32

(a) For any a < b, show that $\lim_{m \to \infty} \int_a^b \sin mx\,dx = 0$. Hints: 139
(b)★ Show that if f is continuous on [a,b], then $\lim_{m \to \infty} \int_a^b f(x) \sin mx\,dx = 0$. Hints: 14, 55, 187, 40
1

t 

6.33x Let f be a continuous function on [0,1]. Evaluate lim x os dt. (Source: [Sp]) Hints: 193, 244, 228, 98 
x—0* x 


dx 


Pr rg (Source: Putnam) Hints: 271, 133, 169, 283 


6.34 Evaluate si 
1 


6.A PROOF OF L’HÔPITAL’S RULE


Before we can rigorously prove l’Hôpital’s Rule, we need to have a more general version of the Mean Value
Theorem.


Problem 6.35: Let f, g be continuous functions on an interval [a,b] and differentiable on (a,b). Show that there
exists c ∈ (a,b) such that

$$f'(c)\,(g(b) - g(a)) = g'(c)\,(f(b) - f(a)).$$

Note that if g(x) = x, then g′(c) = 1 for all c, and the above expression can be rearranged as

$$f'(c) = \frac{f(b) - f(a)}{b - a},$$

which is just the Mean Value Theorem.
Solution for Problem 6.35: Define a function h as
$$h(x) = f(x)\,(g(b) - g(a)) - g(x)\,(f(b) - f(a)).$$
Notice that
$$h(a) = h(b) = f(a)g(b) - f(b)g(a),$$
thus by Rolle’s Theorem, there exists c ∈ (a,b) such that h′(c) = 0. Therefore,
$$0 = h'(c) = f'(c)\,(g(b) - g(a)) - g'(c)\,(f(b) - f(a)),$$

which can be rearranged to $f'(c)\,(g(b) - g(a)) = g'(c)\,(f(b) - f(a))$, as desired. □

The result of Problem 6.35 is called the Extended Mean Value Theorem.


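General statement of l’Hôpital’s Rule: In the following statement,

• $\heartsuit$ can be a real number, a “one-sided real number” (for example, $\lim_{x \to 1^+}$), or ±∞, and

• $\spadesuit$ can be 0 or ±∞.

Let f, g be differentiable functions such that

$$\lim_{x \to \heartsuit} f(x) = \lim_{x \to \heartsuit} g(x) = \spadesuit.$$

Then

$$\lim_{x \to \heartsuit} \frac{f(x)}{g(x)} = \lim_{x \to \heartsuit} \frac{f'(x)}{g'(x)},$$

provided this latter limit exists.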
We will first rigorously prove the “basic” case of l’Hôpital’s Rule.


Problem 6.36: Prove l’Hôpital’s Rule where $\spadesuit = 0$, $\heartsuit = a$ for some a ∈ ℝ, and

$$\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$$

for some L ∈ ℝ.


Note: The modifications to the following proof in the cases that $\heartsuit = a^+$ or $\heartsuit = a^-$ should be clear, and we will
not present them.

Solution for Problem 6.36: We alter f (if necessary) by setting f(a) = 0; note that since $\lim_{x \to a} f(x) = 0$ by assumption, this makes f continuous at a. Similarly alter g if necessary so that g(a) = 0.

We are assuming that $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$. This means that by definition, for any ε > 0, we can choose δ > 0 such that

$$0 < |x - a| < \delta \;\Rightarrow\; \left| \frac{f'(x)}{g'(x)} - L \right| < \epsilon. \qquad (6.A.1)$$

Note that this implies that g′(x) ≠ 0 for all 0 < |x − a| < δ. This further implies that g(x) ≠ 0 for all 0 < |x − a| < δ, since if g(x) = 0 = g(a), then by the Mean Value Theorem there is some ξ between x and a such that g′(ξ) = 0, a contradiction.

We wish to show that

$$0 < |x - a| < \delta \;\Rightarrow\; \left| \frac{f(x)}{g(x)} - L \right| < \epsilon.$$

But f(a) = g(a) = 0, so we can write

$$\left| \frac{f(x)}{g(x)} - L \right| = \left| \frac{f(x) - f(a)}{g(x) - g(a)} - L \right|.$$

Now we apply the Extended Mean Value Theorem: there exists z between x and a such that

$$f'(z)\,(g(x) - g(a)) = g'(z)\,(f(x) - f(a)).$$

Furthermore, since we know that g′(z) ≠ 0 and g(x) ≠ g(a), we may divide by $g'(z)\,(g(x) - g(a))$ to get

$$\frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(z)}{g'(z)},$$

and since 0 < |z − a| < |x − a| < δ, we have by (6.A.1) that

$$\epsilon > \left| \frac{f'(z)}{g'(z)} - L \right| = \left| \frac{f(x) - f(a)}{g(x) - g(a)} - L \right| = \left| \frac{f(x)}{g(x)} - L \right|.$$

Thus, by the δ-ε definition of limit, we conclude that

$$\lim_{x \to a} \frac{f(x)}{g(x)} = L,$$

as desired. □


Problem 6.37: Modify the solution to Problem 6.36 to prove l’Hôpital’s Rule where $\spadesuit = 0$, $\heartsuit = a$ for some a ∈ ℝ, and

$$\lim_{x \to a} \frac{f'(x)}{g'(x)} = \pm\infty.$$

Solution for Problem 6.37: If L is ±∞, then all the statements in the solution to Problem 6.36 of the form

|something − L| < ε

get replaced by statements of the form

something > N (if L = +∞) or something < N (if L = −∞),

but all of the same arguments hold. □


Solution for Problem 6.38: By Problems 6.36 and 6.37, we know that l’Hôpital’s Rule is true for $\heartsuit = 0$. Thus, we
can prove the case where $\heartsuit = \pm\infty$ by replacing f(x) and g(x) by f(1/x) and g(1/x), respectively. In particular:

$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{u \to 0^+} \frac{f(1/u)}{g(1/u)} = \lim_{u \to 0^+} \frac{(-1/u^2)\, f'(1/u)}{(-1/u^2)\, g'(1/u)} = \lim_{u \to 0^+} \frac{f'(1/u)}{g'(1/u)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)}.$$

□


The previous problems prove l’Hôpital’s Rule in all cases where $\spadesuit = 0$. Unfortunately, the case where $\spadesuit = \pm\infty$
is considerably harder. We will prove one specific case of this, and leave it to the reader to extend it to other cases
in a similar manner as above.


We first prove the following lemma: 


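Problem 6.39: Let f be a function such that $\lim_{x \to \infty} f(x) = \infty$, and let a ∈ ℝ. Show that

$$\lim_{x \to \infty} \frac{f(x) + a}{f(x)} = 1.$$


Solution for Problem 6.39: We simply compute:

$$\lim_{x \to \infty} \frac{f(x) + a}{f(x)} = \lim_{x \to \infty} \left( 1 + \frac{a}{f(x)} \right) = 1 + 0 = 1.$$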
Solution for Problem 6.40: By definition, for any ε > 0, there exists some M > 0 such that

$$x > M \;\Rightarrow\; \left| \frac{f'(x)}{g'(x)} - L \right| < \frac{\epsilon}{4}. \qquad (6.A.2)$$

(The reason that we use $\frac{\epsilon}{4}$ rather than ε will become more clear later in the proof.) Also, for all x > M, there exists,
via the Extended Mean Value Theorem, some z ∈ (M, x) such that

$$\frac{f(x) - f(M)}{g(x) - g(M)} = \frac{f'(z)}{g'(z)}.$$

(We have g′(z) ≠ 0 and g(x) ≠ g(M) for essentially the same reason as in Problem 6.36; we will not repeat the
details here.) This, combined with (6.A.2), means that

$$x > M \;\Rightarrow\; \left| \frac{f(x) - f(M)}{g(x) - g(M)} - L \right| < \frac{\epsilon}{4}.$$

However, we want to be able to compare $\frac{f(x)}{g(x)}$ to L, so we introduce some algebraic sleight-of-hand. Define a new
function

$$j(x) = \frac{f(x)}{f(x) - f(M)} \cdot \frac{g(x) - g(M)}{g(x)}, \qquad\text{so that}\qquad \frac{f(x)}{g(x)} = j(x) \cdot \frac{f(x) - f(M)}{g(x) - g(M)}.$$

By Problem 6.39, we know that

$$\lim_{x \to \infty} j(x) = \lim_{x \to \infty} \frac{f(x)}{f(x) - f(M)} \cdot \lim_{x \to \infty} \frac{g(x) - g(M)}{g(x)} = 1 \cdot 1 = 1.$$

Thus, we can choose N > M sufficiently large so that for all x > N,

$$|j(x) - 1| < \min\left\{ 1, \frac{\epsilon}{2|L|} \right\}.$$




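Then for all x > N, we have |j(x)| < 2, and we compute (for L ≠ 0), using the definition of j for the first equality:

$$\begin{aligned}
\left| \frac{f(x)}{g(x)} - L \right| &= \left| j(x) \cdot \frac{f(x) - f(M)}{g(x) - g(M)} - L \right| \\
&\le \left| j(x) \cdot \frac{f(x) - f(M)}{g(x) - g(M)} - L\,j(x) \right| + |L\,j(x) - L| \\
&= |j(x)| \left| \frac{f(x) - f(M)}{g(x) - g(M)} - L \right| + |L|\,|j(x) - 1| \\
&< 2\left( \frac{\epsilon}{4} \right) + |L| \left( \frac{\epsilon}{2|L|} \right) \\
&= \epsilon.
\end{aligned}$$

(If L = 0, then we only require N sufficiently large so that |j(x) − 1| < 1 for all x > N, and in the above computation,
the second term on the next-to-last line will be 0.)


In summary, we have shown that for any ε > 0, there exists N such that

$$x > N \;\Rightarrow\; \left| \frac{f(x)}{g(x)} - L \right| < \epsilon,$$

thus we have established $\lim_{x \to \infty} \frac{f(x)}{g(x)} = L$. □


The proof that l’Hôpital’s Rule holds as in Problem 6.40 for L = ±∞ will be left to the reader.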
CHAPTER 7


SERIES 


7.1 INFINITE SEQUENCES 


Definition: A sequence is a list of numbers. 


Occasionally this list will be finite, but most often in calculus we deal with infinite sequences. We may write
an infinite sequence as a list of numbers separated by commas, with an ellipsis (...) to indicate that the sequence
is infinite, assuming the pattern is clear. For example,


1, 2, 4, 8, 16, 32, ...
is the infinite sequence consisting of all the nonnegative integer powers of 2.


However, more often we refer to elements of a sequence using variables, like so:
$a_1, a_2, a_3, \ldots,$


where the index of each term in the sequence is denoted using a subscript. We would denote the entire sequence
as $\{a_n\}_{n=1}^{\infty}$. Sequences need not start at n = 1. For instance, some sequences might start at n = 0 rather than n = 1;
we would write such a sequence as $\{a_n\}_{n=0}^{\infty}$.


Note, however, that a sequence is not the same as a set (despite the similar-looking notation), because a key
facet of a sequence is that the terms are ordered: there is a first term, then a second term, then a third term, and so
on. We often omit the bounds (n = 1 and ∞) if the context makes clear that $\{a_n\}$ is a sequence (and not a set). And,
of course, if the sequence is finite rather than infinite, we replace ∞ with the index of the last term.


A very common and useful type of sequence is a geometric sequence, which is a sequence in which the ratio 
between every two consecutive terms is constant. For example, the sequence 


3,6, 12,24, 48,... 


is a geometric sequence with first term 3 and common ratio 2. 
Naturally, there is a more formal definition:


Definition: A sequence $\{a_n\}_{n=1}^{\infty}$ is a geometric sequence if there exists a real number r such that $a_{k+1} = r a_k$ for all k ≥ 1. The number r is called the common ratio of the geometric sequence.


Problem 7.1: Suppose $\{a_n\}_{n=1}^{\infty}$ is a geometric sequence with common ratio r. Write a formula for $a_n$ in terms of
$a_1$ and r.


Solution for Problem 7.1: To reach $a_n$ in the sequence, we start with the first term $a_1$ and multiply by the common
ratio r a total of n − 1 times (since we must move n − 1 terms forward in the sequence to get from $a_1$ to $a_n$). Thus,
the formula is $a_n = r^{n-1} a_1$. □


WARNING!! It is a common mistake to write $a_n = r^n a_1$ for the nth term of the geometric sequence starting at $a_1$ with common ratio r. We need to multiply $a_1$ by r only n − 1 times to get the nth term, so the formula is $a_n = r^{n-1} a_1$.
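
To see the off-by-one issue concretely, here is a small Python sketch (the function names are illustrative, not from the text). It compares the closed form with the literal process of starting at the first term and multiplying by r exactly n − 1 times.

```python
def nth_term(a1, r, n):
    """Closed form a_n = r**(n - 1) * a1 for n >= 1."""
    return r ** (n - 1) * a1


def nth_term_by_steps(a1, r, n):
    """Start at a1 and multiply by r exactly n - 1 times."""
    term = a1
    for _ in range(n - 1):
        term *= r
    return term


if __name__ == "__main__":
    # The sequence 3, 6, 12, 24, 48, ... has first term 3 and common ratio 2.
    print([nth_term(3, 2, n) for n in range(1, 6)])
```

Using the exponent n instead of n − 1 would shift every term one place forward in the sequence.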


Sequences can also be defined recursively, meaning that each term is defined in terms of one or more previous 
terms. 


For example, a geometric sequence $\{a_n\}_{n=1}^{\infty}$ with common ratio r can be defined recursively by
$a_k = r a_{k-1}$


for all k ≥ 2. Note that this recursive formula does not tell us anything about the first term $a_1$ (it must be given
or assumed), but all subsequent terms can then be expressed recursively.


An advantage of the recursive method of defining a sequence is that it gives us more flexibility to define other
sequences that would be hard to define explicitly. For example, the Fibonacci sequence is the sequence that starts
1, 1, ... and where each subsequent term is the sum of the previous two terms:


1, 1, 2, 3, 5, 8, 13, 21, 34, ....


This is recursively defined by $a_0 = a_1 = 1$ and $a_k = a_{k-1} + a_{k-2}$ for all k ≥ 2. Note that this recursive formula defines
each term $a_k$ (for k ≥ 2) using the two previous terms (rather than just one term as in the recursive formula for
a geometric sequence), and as such we must specify the first two terms of the sequence, $a_0$ and $a_1$, in order to
completely define the sequence.
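
A recursive definition translates almost directly into a loop. The sketch below (the function name `fibonacci` is just a convenient label) builds the first few terms from the two starting values and the rule that each later term is the sum of the previous two.

```python
def fibonacci(count: int) -> list:
    """Return the first `count` terms of the sequence with a_0 = a_1 = 1."""
    terms = []
    current, following = 1, 1  # the two given starting terms
    for _ in range(count):
        terms.append(current)
        # Slide the window: the next term is the sum of the previous two.
        current, following = following, current + following
    return terms


if __name__ == "__main__":
    print(fibonacci(9))
```

Without the two starting values the loop has nothing to begin from, which mirrors the remark that both initial terms must be specified.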


Sidenote: Although not used in calculus, the Fibonacci sequence occurs quite often in discrete mathematics. You can read more about the Fibonacci sequence and its applications in Art of Problem Solving’s Intermediate Counting & Probability textbook.


Another way to define an infinite sequence is as a real-valued function whose domain is the positive integers
(or possibly the nonnegative integers). Specifically, we think of $\{a_n\}_{n=1}^{\infty}$ as a function a such that $a(n) = a_n$ for any
positive integer n. This is sometimes called a discrete function, because its domain includes only the discrete
values 1, 2, 3, ....


A natural question that we can ask about an infinite sequence $\{a_n\}_{n=1}^{\infty}$ is whether it converges to any number.
In other words, we can try to compute $\lim_{n \to \infty} a_n$. This has the same formal definition as $\lim_{x \to \infty} a(x)$ does for a function
a, except that our sequence is a discrete function and not a function defined on all of ℝ.
Definition: We say that an infinite sequence $\{a_n\}$ converges to L, also written

$$\lim_{n \to \infty} a_n = L,$$

if for any ε > 0 there exists a positive integer N such that for any n > N, we have $|a_n - L| < \epsilon$.


In words, this means that we can choose N sufficiently large so that all terms of the sequence after $a_N$ are within
ε of L. Thus, the sequence gets “arbitrarily close” to L as n gets arbitrarily large.


If the sequence does not converge, we say it diverges. 


Because the definition for the convergence of an infinite sequence is basically the same as that of the limit at infinity
of a function, we can often use our function tools to compute sequence convergence. Specifically, if we can
write $a_n = f(n)$ for some function f whose domain includes the positive real numbers, then

$$\lim_{n \to \infty} a_n = \lim_{x \to \infty} f(x),$$

assuming the limit of the function is defined.


WARNING!! The sequence might converge even if the function doesn’t, so be careful! For example, the sequence $a_n = 0$ trivially converges to 0, but the function $f(x) = \sin \pi x$ does not have a limit as x → ∞, even though $f(n) = a_n$ for all positive integers n.


Problem 7.2: Let $\{a_n\}_{n=1}^{\infty}$ be a geometric sequence with common ratio r. When does this sequence converge,
and to what value (in terms of $a_1$ and r)?


Solution for Problem 7.2: If r = 0, then the sequence is $a_1, 0, 0, 0, \ldots$, which clearly converges to 0. If r ≠ 0, then we
have $a_n = r^{n-1} a_1$, and we compute

$$\lim_{n \to \infty} r^{n-1} a_1 = a_1 \lim_{n \to \infty} r^{n-1}.$$

If $a_1 = 0$, then the limit is 0, and the sequence is the constant sequence in which every term is 0. Otherwise, if
$a_1 \ne 0$, then the sequence converges if and only if $r^{n-1}$ converges. This will converge if and only if r ∈ (−1, 1]. Note
that if r = 1, then the sequence is constant, and if |r| < 1, then the sequence converges to 0. Also note that the
sequence does not converge if r = −1, since $(-1)^n$ does not converge as n → ∞, but instead the sequence alternates
between $a_1$ and $-a_1$. □


Some sequences are a lot trickier to analyze. For example: 
Problem 7.3: Does the sequence $a_n = \left(1 + \frac{1}{n}\right)^n$ converge?


Solution for Problem 7.3: This sequence consists of the values at positive integers of the function $f(x) = \left(1 + \frac{1}{x}\right)^x$.
We computed the limit of this function in Problem 6.14, and thus

$$\lim_{n \to \infty} a_n = \lim_{x \to \infty} f(x) = e.$$
□
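
As a rough numerical illustration (not a proof), the Python sketch below prints a few terms of this sequence next to their distance from e. The function name `a` simply mirrors the sequence notation, and ordinary floating-point arithmetic is assumed, so very large n would eventually lose accuracy.

```python
import math


def a(n: int) -> float:
    """n-th term of the sequence a_n = (1 + 1/n)**n."""
    return (1 + 1 / n) ** n


if __name__ == "__main__":
    for n in (1, 10, 1000, 100000):
        # The gap to e shrinks as n grows.
        print(n, a(n), math.e - a(n))
```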


In many instances, we don’t really care what value a particular infinite sequence converges to; we'd just like 
to know whether it converges or not. In these cases, we can examine sequences a bit more qualitatively, as in the 
following examples. 
Problem 7.4: Suppose that we have a sequence $\{a_n\}$ that is increasing: that is, $a_k \le a_{k+1}$ for all k ≥ 1. Show that
the sequence converges if and only if it is bounded above, meaning that there is some number M such that
$a_n \le M$ for all positive integers n.


An example of such a sequence is

$$0, \frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \ldots,$$

that is, the sequence given by $a_n = 1 - \frac{1}{n}$ for all positive integers n. This sequence is increasing, and clearly
converges to 1.
Solution for Problem 7.4: First, we assume that the sequence is bounded above; we will prove that it converges. 
Thinking back to Chapter 1, what do we know about a nonempty set of real numbers that has an upper bound? 
We know that the set has a least upper bound, called the supremum of the set. As you recall, this is one of the 
defining axioms of the real numbers. 
We claim that if $\{a_n\}$ is increasing and bounded above, then
$$\lim_{n \to \infty} a_n = \sup\{a_n\}.$$
Note that when we write $\sup\{a_n\}$, we are thinking of the elements of the sequence merely as a set and not as an
ordered list.


To prove our claim, let $A = \sup\{a_n\}$, and let ε > 0 be given. Choose N such that $a_N > A - \epsilon$. Such an N must
exist, because if it does not, then A − ε is an upper bound for $\{a_n\}$, contradicting the fact that A is the least upper
bound for $\{a_n\}$. Then, since the sequence is increasing, we have

$$n > N \;\Rightarrow\; A \ge a_n \ge a_N > A - \epsilon \;\Rightarrow\; 0 \le A - a_n < \epsilon.$$
Thus, by the definition of convergence, we conclude that $\{a_n\}$ converges to A.


We leave the opposite direction of the proof (that is, proving that if an increasing sequence converges, then it
has an upper bound) as an exercise. □


There is a similar statement for decreasing sequences: if a sequence $\{a_n\}_{n=1}^{\infty}$ is a decreasing sequence, so that
$a_{k+1} \le a_k$ for all k ≥ 1, then the sequence converges if and only if it has a lower bound, and if this is the case,
then the sequence converges to the greatest lower bound $\inf\{a_n\}$. The proof is virtually identical to the solution to
Problem 7.4, so we will omit it.


We have some terminology:
Definition:


• A sequence $\{a_n\}_{n=1}^{\infty}$ is called monotonically increasing if $a_k \le a_{k+1}$ for all k ≥ 1.


• A sequence $\{a_n\}_{n=1}^{\infty}$ is called monotonically decreasing if $a_k \ge a_{k+1}$ for all k ≥ 1.


• A sequence $\{a_n\}_{n=1}^{\infty}$ is called strictly monotonically increasing if $a_k < a_{k+1}$ for all k ≥ 1.
• A sequence $\{a_n\}_{n=1}^{\infty}$ is called strictly monotonically decreasing if $a_k > a_{k+1}$ for all k ≥ 1.


• In any of these cases, $\{a_n\}$ is called monotonic.


Thus our conclusion is:


Concept: A bounded monotonic sequence converges.
Let’s look at a couple more sequence convergence examples.


Solution for Problem 7.5: It sometimes helps to list a few terms, to get some idea of what’s going on:

$$a_1 = \frac{1!}{1^1} = 1, \qquad a_2 = \frac{2!}{2^2} = \frac{1}{2}, \qquad a_3 = \frac{3!}{3^3} = \frac{2}{9}, \qquad a_4 = \frac{4!}{4^4} = \frac{3}{32}.$$

It seems that the sequence is strictly monotonically decreasing, and that 0 is the greatest lower bound. So we
suspect that the sequence converges to 0. Let’s try to prove both of these facts.


First, we want to show that $a_n > a_{n+1}$ for all n ≥ 1. Since all the terms are positive, this is equivalent to showing
that $\frac{a_{n+1}}{a_n} < 1$.


Let’s write out this expression:

$$\frac{a_{n+1}}{a_n} = \frac{\;\frac{(n+1)!}{(n+1)^{n+1}}\;}{\frac{n!}{n^n}} = \frac{(n+1)!\,n^n}{n!\,(n+1)^{n+1}} = \left( \frac{n}{n+1} \right)^n < 1.$$

So the sequence is strictly monotonically decreasing. Since it has a lower bound (namely, 0), we know it converges.
We’d like to show that $0 = \inf\{a_n\}$ so that we can conclude that the sequence converges to 0.


To show that 0 is the greatest lower bound, we need to show that the sequence gets arbitrarily close to 0 as n
grows large. To see this, let’s write $a_n$ in a slightly different way:

$$a_n = \frac{n!}{n^n} = \frac{1}{n} \cdot \frac{2}{n} \cdot \frac{3}{n} \cdots \frac{n}{n}.$$

The last term is $\frac{n}{n} = 1$, so we have

$$a_n = \frac{1}{n} \cdot \frac{2}{n} \cdots \frac{n-1}{n}.$$

We notice that at least half of these terms are less than or equal to $\frac{1}{2}$ (in particular, the terms where the numerator
is at most $\frac{n}{2}$), and the rest of the terms are less than 1. Therefore,

$$a_n \le \left( \frac{1}{2} \right)^{\frac{n-1}{2}} = \frac{1}{(\sqrt{2})^{n-1}}.$$

Thus

$$0 < a_n \le \frac{1}{(\sqrt{2})^{n-1}}$$

for all n ≥ 1, and taking the limit as n → ∞ and using the Squeeze Theorem, we have

$$0 \le \lim_{n \to \infty} a_n \le \lim_{n \to \infty} \frac{1}{(\sqrt{2})^{n-1}} = 0.$$

Thus, $\inf\{a_n\} = \lim_{n \to \infty} a_n = 0$. □
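
The two facts used in this solution can be spot-checked with exact rational arithmetic. The Python sketch below is only a sanity check for small n, not a substitute for the proof. The names `a` and `within_bound` are illustrative, and the bound tested is $a_n \le 1/(\sqrt{2})^{n-1}$, rewritten by squaring so that no square roots are needed.

```python
from fractions import Fraction
from math import factorial


def a(n: int) -> Fraction:
    """Exact value of a_n = n! / n**n."""
    return Fraction(factorial(n), n ** n)


def within_bound(n: int) -> bool:
    """Check a_n <= 1/(sqrt 2)**(n - 1) exactly, by squaring both sides."""
    # Both sides are positive, so the inequality is equivalent to
    # a_n**2 * 2**(n - 1) <= 1, which avoids floating-point square roots.
    return a(n) ** 2 * 2 ** (n - 1) <= 1


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, a(n), float(a(n)), within_bound(n))
```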


Problem 7.6: Determine if the sequence $a_n = n^{1/n}$ converges, and if so, to what value.


Solution for Problem 7.6: As in the previous problem, experimenting (perhaps with your calculator) might convince
you that this sequence is monotonically decreasing (from n = 3 onward) and that 1 is a lower bound. So we might guess that it converges
to 1.
We can try computing
$$\lim_{x \to \infty} x^{1/x}.$$

If this limit exists, then it is also the limit of our sequence (although, recall the warning on page 222: the sequence
might converge even if the above limit does not exist). Noticing that this function is an $\infty^0$ indeterminate form as
x → ∞, we try taking logs and using l’Hôpital’s Rule. Specifically, we have that

$$\log x^{1/x} = \frac{\log x}{x}.$$

We see that $\frac{\log x}{x}$ is an $\frac{\infty}{\infty}$ indeterminate form, so we apply l’Hôpital’s Rule, giving

$$\lim_{x \to \infty} \frac{\log x}{x} = \lim_{x \to \infty} \frac{1/x}{1} = \lim_{x \to \infty} \frac{1}{x} = 0.$$

Taking exponentials, we get $\lim_{x \to \infty} x^{1/x} = e^0 = 1$, and hence the original sequence converges to 1. □
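
The "experimenting" step can be done in a few lines of Python. This is an informal sketch using floating-point arithmetic, with the function name `a` chosen to match the sequence. It shows the terms rising to a peak at n = 3 and then decreasing toward 1.

```python
def a(n: int) -> float:
    """n-th term of the sequence a_n = n**(1/n)."""
    return n ** (1 / n)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 10, 1000, 10**6):
        print(n, a(n))
```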


EXERCISES 


7.1.1 In each of the following, determine if the sequence $\{a_n\}_{n=1}^{\infty}$ converges, and, if it converges, what it converges
to.
n> —1 _ 3n? + 2¢e" 


(a) an = + One+on+1 ) & n—e 


(d) a, = Vr2+1- Vn+1° Hints:266 (e) a, = Vn2+1 Hints: 18 


7.1.2 Prove that when each term in a geometric sequence is multiplied by a constant k, then the resulting sequence 
is also geometric. 


7.1.3 


nsinn 


(c)-a@,= re Hints: 234 


(a) Show that nonzero numbers x, y, and z are consecutive terms of a geometric sequence if and only if y? = xz. 
(b) Is it true that nonzero numbers x, y, and z are consecutive terms of a geometric sequence if and only if y = xz?


7.1.4 Prove a stronger version of the “only if” direction of Problem 7.4: if an infinite sequence converges, then it 
is bounded. 
7.2 INFINITE SERIES


Summing the elements of a sequence results in a series. If the sequence is finite, this is pretty easy, although 
the technical details of the definition may be somewhat more complicated than you are used to: 


Definition: The series corresponding to the finite sequence $\{a_n\}_{n=1}^{k}$ is the finite sequence of partial sums of
the initial terms of the sequence:
$$a_1,\; a_1 + a_2,\; a_1 + a_2 + a_3,\; \ldots,\; a_1 + a_2 + \cdots + a_k.$$

The final term $a_1 + \cdots + a_k = \sum_{i=1}^{k} a_i$ is the sum of the series.


WARNING!! The terminology can be somewhat ambiguous. We often refer to the “series” as simply the sum $\sum_{i=1}^{k} a_i$ even though this is really a number. However, inherent in the definition of “series” is the order of the terms that we are summing, which is why it is more proper to consider a series as a sequence of partial sums. This is not much of a concern with finite series, but as we will soon see it is more of an issue for infinite series.


More generally, sometimes we say that a “series” is the sum of a sequence, 
and other times we “sum a series” to get a value. Usually the context makes 
it clear what we are talking about. 


If our sequence is finite, then there’s really no issue: we just sum the terms. In particular, if we have a finite
sequence $\{a_n\}_{n=1}^{k}$, then the sum of the corresponding series is

$$a_1 + a_2 + \cdots + a_k = \sum_{i=1}^{k} a_i.$$


We see this with a basic example that you probably already know: 


Problem 7.7: What is the sum of the series corresponding to the finite geometric sequence with n terms, first 
term a, and common ratio r? 


Solution for Problem 7.7: Let’s write the sum explicitly:

$$\begin{aligned}
\text{Sum} &= a + ra + r^2 a + \cdots + r^{n-1} a \\
&= a(1 + r + r^2 + \cdots + r^{n-1}).
\end{aligned}$$

There are two ways to finish from here. One way is to recall the factorization
$$r^n - 1 = (r - 1)(r^{n-1} + r^{n-2} + \cdots + 1).$$

Thus we can write our sum as
$$\frac{a(r^n - 1)}{r - 1}.$$
Of course, we can’t do this if r = 1. In that case, the sum is just na.
The other method is to denote the sum as
$$S = a(1 + r + r^2 + \cdots + r^{n-1}),$$

so that
$$rS = a(r + r^2 + r^3 + \cdots + r^n).$$

When we subtract, most of the terms cancel:

$$rS - S = a(r^n - 1),$$
and dividing by r − 1 (again assuming r ≠ 1) gives
$$S = \frac{a(r^n - 1)}{r - 1}.$$
□

The sum of the series corresponding to the finite geometric sequence with n terms, first term a, and common ratio r is $\frac{a(r^n - 1)}{r - 1}$ if r ≠ 1, and na if r = 1.
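
The closed form can be checked against a term-by-term sum. The Python sketch below uses exact fractions so that no rounding is involved. The function names are illustrative, and the r = 1 case is handled separately because the formula would otherwise divide by zero.

```python
from fractions import Fraction


def geometric_sum(a, r, n):
    """Sum of the n terms a, ra, ..., r**(n-1) * a using the closed form."""
    if r == 1:
        return n * a  # the formula below would divide by zero
    return a * (r ** n - 1) / (r - 1)


def term_by_term_sum(a, r, n):
    """The same sum, added up one term at a time."""
    return sum(a * r ** k for k in range(n))


if __name__ == "__main__":
    a, r = Fraction(3), Fraction(2)
    # 3 + 6 + 12 + 24 + 48, computed both ways.
    print(geometric_sum(a, r, 5), term_by_term_sum(a, r, 5))
```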


Much more interesting, and more useful to calculus, are sums of terms of infinite sequences, and in particular 
the “sum” of an entire infinite sequence. Since we can’t formally evaluate an infinite sum, we do the most logical 
alternative: we take the limit of finite sums. 


Definition: The infinite series corresponding to the infinite sequence $\{a_n\}_{n=1}^{\infty}$ is the infinite sequence of partial
sums of the initial terms of the sequence:

$$a_1,\; a_1 + a_2,\; a_1 + a_2 + a_3,\; \ldots$$

The sum of the series is defined as the limit of the terms of the series; that is,

$$\sum_{i=1}^{\infty} a_i = \lim_{n \to \infty} \sum_{i=1}^{n} a_i.$$

If this limit exists and is equal to some real number S, we say that the series converges to S; if the limit is ±∞
or does not exist, we say that the series diverges.


WARNING!! The terminology tends to be a bit sloppy. We often refer to the “infinite series” as the “sum” $\sum_{i=1}^{\infty} a_i$ even though this is really a limit of the partial sums in our series.


Let’s look at our basic example: 


Problem 7.8: What is the sum of the series corresponding to the infinite geometric sequence with first term a 


and common ratio r? 


Solution for Problem 7.8: If r = 1, then the geometric sequence is the constant sequence a, a, a, ..., and thus the
corresponding series does not converge unless a = 0. So let’s assume that r ≠ 1. We need to compute

$$\lim_{n \to \infty} \frac{a(r^n - 1)}{r - 1}.$$

If a = 0, then the sum of the series is trivially 0, so let’s further assume that a is nonzero.


We can factor out the constants:

$$\frac{a}{r - 1} \lim_{n \to \infty} (r^n - 1).$$

We now have cases depending on the value of r.


If |r| < 1, then $r^n$ approaches 0, and the sum becomes $\frac{-a}{r - 1} = \frac{a}{1 - r}$. Note that this works regardless of whether r is
positive or negative.


If r = −1 or |r| > 1, then $r^n$ does not converge as n → ∞, so the sum diverges. □


Given an infinite geometric sequence with first term a ≠ 0 and common ratio r,
the corresponding series converges to $\frac{a}{1 - r}$ if |r| < 1, and diverges if |r| ≥ 1.
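
The three cases can be watched numerically. The Python sketch below is an informal floating-point illustration with a made-up helper name, `partial_sums`. It lists the first few partial sums for one ratio of each kind.

```python
def partial_sums(a: float, r: float, count: int) -> list:
    """First `count` partial sums of a + a*r + a*r**2 + ..."""
    sums, total, term = [], 0.0, a
    for _ in range(count):
        total += term
        sums.append(total)
        term *= r
    return sums


if __name__ == "__main__":
    a, r = 1.0, 0.5
    print(partial_sums(a, r, 10)[-1], a / (1 - r))  # |r| < 1: close to a/(1 - r)
    print(partial_sums(1.0, -1.0, 6))                # r = -1: the sums oscillate
    print(partial_sums(1.0, 2.0, 6))                 # |r| > 1: the sums keep growing
```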


Solution for Problem 7.9: We can sum an infinite geometric series if we know the first term and the common ratio,
but in this problem the first term is not given to us. However, we can let the first term be a and see if we can find
any helpful equations to work with. The sum of the first series, in terms of a and r, is

$$\frac{a}{1 - r} = 15.$$

We can find the sum of the second series in terms of a and r in the same way. We note that the second series is
geometric with first term $a^2$ and common ratio $r^2$, so we have the sum

$$\frac{a^2}{1 - r^2} = 45.$$

Thus, we now have two equations involving r and a:

$$\frac{a}{1 - r} = 15, \qquad \frac{a^2}{1 - r^2} = 45.$$

To solve this system of equations most easily, we can multiply through by the denominators:

$$a = 15(1 - r), \qquad a^2 = 45(1 - r^2).$$

We notice that the first equation involves expressions that are both factors of the corresponding parts of the second
equation. This gives us the idea to divide the second equation by the first equation, giving a = 3(1 + r).


We now have two linear expressions for a, so we can equate these expressions and solve for r:

$$15 - 15r = 3 + 3r,$$

giving 12 = 18r and $r = \frac{2}{3}$.
To finish the problem, we need the first term of the original series, which is just a, so

$$a = 3 + 3r = 3 + 2 = 5.$$
□
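
As a quick check of the arithmetic, the values a = 5 and r = 2/3 can be substituted back into both equations of the system. The sketch below does this with exact fractions; the variable names are just labels for the two sums.

```python
from fractions import Fraction

a, r = Fraction(5), Fraction(2, 3)

first_sum = a / (1 - r)          # sum of a + ar + ar^2 + ...
squares_sum = a**2 / (1 - r**2)  # sum of a^2 + (ar)^2 + (ar^2)^2 + ...

print(first_sum, squares_sum)
```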


EXERCISES 
7.2.1 Suppose $\{a_n\}$ and $\{b_n\}$ are infinite sequences such that the corresponding infinite series converge:

$$\sum_{i=1}^{\infty} a_i = A \quad\text{and}\quad \sum_{i=1}^{\infty} b_i = B.$$

(a) Let m ∈ ℝ and let $\{c_n\}$ be the sequence such that $c_n = m a_n$ for all n. Prove that

$$\sum_{i=1}^{\infty} c_i = mA.$$

(b) Let $\{d_n\}$ be the sequence such that $d_n = a_n + b_n$ for all n. Prove that

$$\sum_{i=1}^{\infty} d_i = A + B.$$

Hints: 158


7.2.2 Sum each of the following infinite geometric series: 


(a) =t+atat: 
(b) 192+144+108+--- 


v2 1 
(c) g-¥eel=F55- 


7.2.3 Let $\{a_n\}_{n=1}^{\infty}$ be an infinite sequence, and suppose that $a_n = b_n - b_{n+1}$ (for all n ≥ 1) for some sequence $\{b_n\}_{n=1}^{\infty}$.


(a) Show that, if $\lim_{n \to \infty} b_n = 0$, then $\sum_{n=1}^{\infty} a_n = b_1$. (We call this method of computing the sum telescoping.)


(b) Compute $\sum_{n=1}^{\infty} \frac{1}{n^2 + n}$. Hints: 205


(c) Compute $\sum_{n=1}^{\infty} \frac{6^n}{(3^{n+1} - 2^{n+1})(3^n - 2^n)}$. (Source: Putnam) Hints: 150, 116, 129
n=1 


2. ile 2 


’ 1 , 
7.2.4 Find the sum 5 + mn + a + 7 +--+, (Source: AHSME) Hints: 239 


7.3 SERIES CONVERGENCE TESTS 


The first question that we usually ask when presented with an infinite series is: does it converge? In this
section, we will discuss a number of convergence tests that we can perform on infinite series. Most of the tests
that we present in this section only work for infinite series corresponding to sequences with nonnegative values;
in Section 7.4, we will discuss methods to study series with both positive and negative terms.


Before going on to more sophisticated convergence tests, we should note one basic fact about convergence that 
should be clear: 


Problem 7.10: If $\sum_{n=1}^{\infty} a_n$ converges, then what do we know about $\lim_{n \to \infty} a_n$?


Solution for Problem 7.10: Intuitively, we can reason as follows: “We’re taking an infinite sum of terms of a 
sequence, so if that sum ends up being finite, then the terms that we are summing had better get arbitrarily close 
to 0. So the limit of the sequence must be 0 if the series converges.” 


This intuitive reasoning is absolutely correct. The proof, however, is a bit technical. If you are convinced of 
the intuitive reasoning, feel free to skip to the end of the solution; if you want to see the gory details, read on. 


Let $L = \sum_{k=1}^{\infty} a_k$ be the sum of the series. Then we know that given any ε > 0, there exists some N so that for any
n > N,

$$\left| L - \sum_{k=1}^{n} a_k \right| < \epsilon.$$

In particular, we can compute the sums for n and n + 1:

$$\left| L - \sum_{k=1}^{n} a_k \right| < \epsilon \quad\text{and}\quad \left| L - \sum_{k=1}^{n+1} a_k \right| < \epsilon.$$

What happens when we add these inequalities? If we reverse the order inside one of the absolute values, then we
can make most of the terms cancel when we apply the Triangle Inequality:

$$2\epsilon > \left| L - \sum_{k=1}^{n} a_k \right| + \left| \sum_{k=1}^{n+1} a_k - L \right| \ge \left| \sum_{k=1}^{n+1} a_k - \sum_{k=1}^{n} a_k \right| = |a_{n+1}|.$$

Thus we get $|a_{n+1}| < 2\epsilon$. But this is true for all n > N. So we have proved that, given any 2ε > 0, we can find N > 0
such that $|a_n| < 2\epsilon$ for any n > N + 1. This, by definition, means that $\lim_{n \to \infty} a_n = 0$. □


Taking the contrapositive of Problem 7.10 gives us our most basic test of convergence or divergence: 


Important: The Divergence Test: if $\{a_n\}$ is a sequence such that $\lim_{n\to\infty} a_n \ne 0$ or $\lim_{n\to\infty} a_n$ does not exist, then $\sum_{n=1}^{\infty} a_n$ diverges.


One common infinite series that occurs in calculus (and indeed in higher mathematics in general) is:

$$\sum_{n=1}^{\infty} \frac{1}{n^p},$$

where $p > 0$ is some fixed real number. This is called a p-series. If $p = 1$, this is the harmonic series:

$$1 + \frac12 + \frac13 + \frac14 + \cdots.$$

Problem 7.11: Does the harmonic series converge, and if so, to what?


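Solution for Problem 7.11: We can list the first few partial sums:

$$\begin{aligned}
1 &= 1, \\
1 + \tfrac12 &= 1.5, \\
1 + \tfrac12 + \tfrac13 &= 1.8333\ldots, \\
1 + \tfrac12 + \tfrac13 + \tfrac14 &= 2.0833\ldots, \\
1 + \tfrac12 + \tfrac13 + \tfrac14 + \tfrac15 &= 2.2833\ldots
\end{aligned}$$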


It is clear that these sums form a monotonically increasing sequence, and because the growth rate of the sequence 
is slowing, we may suspect that it is bounded above, and thus converges. 


However, it diverges, and there is a relatively simple yet ingenious proof. The clever idea is to group the terms 
by powers of 2: 


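$$\begin{aligned}
\sum_{n=1}^{\infty} \frac{1}{n} = {} & 1 \\
& + \frac12 \\
& + \left(\frac13 + \frac14\right) \\
& + \left(\frac15 + \frac16 + \frac17 + \frac18\right) \\
& + \left(\frac19 + \frac1{10} + \cdots + \frac1{16}\right) \\
& + \cdots
\end{aligned}$$

Note that we can continue this process indefinitely, and that each row (after the first) will be at least $\frac12$. Thus,
the sum grows without bound, hence the series diverges. □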
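
The grouping argument can be checked with exact arithmetic. The following is only a minimal sketch: the helper name `harmonic_blocks` is our own choice, and a finite computation illustrates the proof rather than replacing it.

```python
from fractions import Fraction

def harmonic_blocks(num_blocks):
    """Return the row sums 1/(2^(j-1)+1) + ... + 1/2^j for j = 1..num_blocks."""
    sums = []
    for j in range(1, num_blocks + 1):
        start, end = 2 ** (j - 1) + 1, 2 ** j
        sums.append(sum(Fraction(1, n) for n in range(start, end + 1)))
    return sums

blocks = harmonic_blocks(6)          # rows covering n = 2, ..., 64
partial = 1 + sum(blocks)            # the partial sum of the first 64 terms

# Every row is at least 1/2, so the first 2^m terms sum to at least 1 + m/2.
print([float(b) for b in blocks])
print(float(partial), ">=", 1 + 6 / 2)
```

Each additional row adds at least another 1/2, which is why no upper bound can exist.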


The reasoning that we used in the previous solution is a basic example of the Series Comparison Test, which
we explore in the next problem. In this problem, we also introduce the shorthand notation $\sum a_n$ for the sum of the
series corresponding to the sequence $\{a_n\}_{n=1}^{\infty}$.


Problem 7.12: Suppose $\{a_n\}$ and $\{b_n\}$ are two sequences such that $0 \le a_n \le b_n$ for all $n$.
(a) If $\sum b_n$ converges, what can we conclude about $\sum a_n$?
(b) If $\sum a_n$ converges, what can we conclude about $\sum b_n$?
(c) What conclusions can we make about divergence?
(d) Why do we need the condition that the terms of the sequences are nonnegative?


Solution for Problem 7.12: 


(a) The key idea is that the sequence of partial sums

$$a_1,\; a_1 + a_2,\; a_1 + a_2 + a_3,\; \ldots$$

is a monotonically increasing sequence (since all the $a_i$ are nonnegative). So to prove it converges, we just need
to show that it has an upper bound. Fortunately, we have a very likely candidate for an upper bound. Let

$$B = \lim_{n\to\infty} \sum_{k=1}^{n} b_k = \sum_{k=1}^{\infty} b_k.$$

We claim that $B$ is an upper bound for the sequence of partial sums of the $a_i$. This is clear, since for any
positive integer $n$,

$$\sum_{k=1}^{n} a_k \le \sum_{k=1}^{n} b_k \le B.$$

So the partial sums of $\{a_n\}$ are bounded, and thus the monotonically increasing sequence of partial sums of
$\{a_n\}$ converges, which by definition means that the series $\sum a_n$ converges.


(b) Part (a) does not work in the other direction. To take a simple example, let

$$\{a_n\} = 1, \frac12, \frac14, \frac18, \ldots$$

and

$$\{b_n\} = 1, \frac12, \frac13, \frac14, \ldots$$

Note that $a_n = \frac{1}{2^{n-1}} \le \frac{1}{n} = b_n$ for all $n \ge 1$. Yet $\sum a_n$ converges whereas $\sum b_n$ diverges.


(c) We just take the contrapositive of part (a). That is, if $\sum a_n$ diverges, then $\sum b_n$ must diverge too (since if $\sum b_n$
converged, we would have to have $\sum a_n$ converge as well).


(d) None of this works if the sequences have negative terms, because then the sequences of partial sums of the 
series will not be monotonically increasing. However, the problem can be suitably modified if all of the terms 
of both sequences are nonpositive; we will leave this as an exercise. 


Problem 7.12 can be distilled to: 


Important: The Series Comparison Test: suppose $\{a_n\}$ and $\{b_n\}$ are two sequences such that
$0 \le a_n \le b_n$ for all $n$.

• If $\sum b_n$ converges, then $\sum a_n$ converges too.
• If $\sum a_n$ diverges, then $\sum b_n$ diverges too.


Problem 7.13: Prove that the p-series $\sum_{k=1}^{\infty} \frac{1}{k^p}$ diverges for all $0 < p < 1$.


Solution for Problem 7.13: We already saw in Problem 7.11 that the harmonic series (that is, the p-series for $p = 1$)
diverges. But when $0 < p < 1$, we have $0 < k^p \le k$ for all $k \ge 1$, hence

$$\frac{1}{k^p} \ge \frac{1}{k}$$

for all positive integers $k$. Thus the terms of the p-series with $0 < p < 1$ are greater than or equal to the corresponding
terms of the harmonic series. Therefore, by the Series Comparison Test, since the harmonic series diverges, so
does the p-series. □

We'll examine the p-series for $p > 1$ with our next tool.


Whenever we have a sum, we can try to express this sum as a Riemann sum of an appropriate function over an 
appropriate interval, and then use our knowledge of the corresponding definite integral to get information about 
the sum. For example, let’s revisit the harmonic series: 


$$1 + \frac12 + \frac13 + \frac14 + \cdots.$$


Solution for Problem 7.14: 


(a) Writing the sum in summation notation might help to see it more clearly:

$$\sum_{k=1}^{n} \frac{1}{k}.$$

This is a Riemann sum of the function $f(x) = 1/x$ using boxes of width 1,
as in the diagram to the right. Note that the first box has area 1, the second
box has area $\frac12$, the third box has area $\frac13$, and so on. Thus, we have cleverly
arranged it so that all of the boxes have the maximum possible height; in other
words, our Riemann sum is an upper Darboux sum of $f(x) = 1/x$ on the
interval $[1, n + 1]$.

(b) Using part (a), we see that the sum is strictly larger than the area under $y = \frac1x$
along $[1, n + 1]$; that is,

$$\sum_{k=1}^{n} \frac{1}{k} > \int_1^{n+1} \frac{1}{x}\,dx = \log(n + 1) - \log(1) = \log(n + 1).$$

(c) As $n \to \infty$, we see that the partial sums grow without bound, since $\lim_{n\to\infty} \log(n + 1) = \infty$. Therefore, the sum
diverges.

□


Problem 7.14 is an example of what is called (not surprisingly) the Integral Test for infinite series. But note 
the geometric feature of the graph that made it work: the function $f(x) = \frac1x$ is a decreasing function. That’s what
ensured that the boxes that make up our Riemann sum are larger than the area under the curve, so that the partial 
sum is in fact an upper Darboux sum that is an upper bound of the integral. 
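
The inequality from Problem 7.14 is easy to observe numerically. This is a small sketch under our own naming (`harmonic_partial_sum` is a hypothetical helper, not something from the text); it compares each partial sum with $\log(n+1)$.

```python
import math

def harmonic_partial_sum(n):
    """Sum of 1/k for k = 1..n (the upper Darboux sum of 1/x on [1, n+1])."""
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    h = harmonic_partial_sum(n)
    # The partial sum always exceeds the integral, which itself grows without bound.
    print(n, round(h, 4), round(math.log(n + 1), 4))
```

The gap between the two columns stays bounded, so the partial sums grow at essentially the same (slow) logarithmic rate as the integral.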
The test works for convergent series too (though we will leave the proof of this as an exercise). Here is the
complete result: 


Important: The Integral Test for series: Suppose $a_n = f(n)$ for some positive decreasing
continuous function $f$. Then

$$\sum_{n=1}^{\infty} a_n$$

has the same behavior as

$$\int_1^{\infty} f(x)\,dx:$$

they both converge or they both diverge.


WARNING!! In the convergence case of the Integral Test, the value of the integral typically
will not equal the sum of the series.


For example, the series

$$\sum_{n=1}^{\infty} \frac{1}{2^n}$$

converges to 1. If we apply the Integral Test, we get

$$\int_1^{\infty} \left(\frac12\right)^x dx = \int_1^{\infty} e^{x\log(1/2)}\,dx = \left[\frac{e^{x\log(1/2)}}{\log(1/2)}\right]_1^{\infty} = 0 - \frac{1/2}{\log(1/2)} = \frac{1}{2\log 2}.$$


However, in the exercises, you will be asked to explore how the value of the integral is related to the sum of the 
series. 


We can now settle the remaining cases of p-series: 


Problem 7.15: Determine the values of $p > 0$ such that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges.


Solution for Problem 7.15: In Problems 7.11 and 7.13, we showed that the p-series diverges for $0 < p \le 1$. It is perhaps not
too hard to believe that the p-series converges for $p > 1$. We can easily show this using the Integral Test:

$$\int_1^{\infty} \frac{1}{x^p}\,dx = \lim_{b\to\infty} \int_1^{b} \frac{1}{x^p}\,dx = \lim_{b\to\infty} \frac{1}{1-p}\left(\frac{1}{b^{p-1}} - 1\right).$$

If $p > 1$, then $\lim_{b\to\infty} \frac{1}{b^{p-1}} = 0$, so the integral converges, and hence by the Integral Test, the p-series converges. □


Important:

The series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges for $p > 1$, and diverges for $0 < p \le 1$.
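
To get a feel for the dividing line at $p = 1$, we can tabulate partial sums for a few values of $p$. This is a rough numerical sketch (the function name `p_series_partial_sum` is our own); finite sums can only suggest convergence or divergence, not prove it.

```python
def p_series_partial_sum(p, n):
    """Sum of 1/k^p for k = 1..n."""
    return sum(1.0 / k ** p for k in range(1, n + 1))

for p in (0.5, 1.0, 2.0):
    sums = [round(p_series_partial_sum(p, n), 4) for n in (10, 100, 1000, 10000)]
    # For p <= 1 the sums keep growing; for p = 2 they level off below 2.
    print(p, sums)
```

For $p = 2$ the partial sums settle down quickly, while for $p = 1$ and $p = 0.5$ they keep climbing as more terms are added.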
Sidenote:


When we allow $p$ to be a complex number with real part greater than 1, the p-series
generalizes to the Riemann zeta function, although traditionally it is written with
the letter $s$ instead of $p$:

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$


The Riemann zeta function is the subject of perhaps the most famous unsolved 
problem in higher mathematics, the Riemann Hypothesis, which is well beyond 
the scope of this book. 


There are a couple of other series convergence tests that are important and useful. The first of these compares 
an unknown series to a known series that is “close to” the unknown series: 


Important: The Limit Comparison Test: Let $\{a_n\}$ and $\{b_n\}$ be positive sequences, and suppose that

$$\lim_{n\to\infty} \frac{a_n}{b_n} = c \ne 0.$$

Then $\sum a_n$ and $\sum b_n$ either both converge or both diverge.


Note that the key to this test is that $c$ is nonzero. We cannot use this test if the limit is 0 or does not exist.
We will leave the formal proof of the Limit Comparison Test as an exercise. The statement should make some
intuitive sense: basically it means that $\{a_n\}$ and $\{b_n\}$ (up to a constant) behave the same as $n$ goes towards infinity.

A more precise proof uses our earlier Series Comparison Test: if $\frac{a_n}{b_n} \to c$, then $a_n < 2c\,b_n$ for sufficiently large $n$, so
$\sum a_n$ converges if $\sum b_n$ converges. Conversely, $b_n < \frac{2}{c}\,a_n$ for sufficiently large $n$, so $\sum b_n$ converges if $\sum a_n$ converges.
(We leave it as an exercise to fill in the missing details.)


Let’s see how we use the Limit Comparison Test: 


Problem 7.16: Does $\sum_{n=1}^{\infty} \frac{n^2 - n + 3}{n^3 + 3n^2 - 5n + 1}$ converge?


Solution for Problem 7.16: Note that the limit of the terms is 0, so it might converge. But what we notice is that as
$n$ gets large, this sequence behaves like the harmonic sequence. So we suspect that the sum diverges. We can use
the Limit Comparison Test to prove this.

We have $a_n = \frac{n^2 - n + 3}{n^3 + 3n^2 - 5n + 1}$. We want to compare it to the sequence $b_n = \frac1n$. This gives

$$\lim_{n\to\infty} \frac{a_n}{b_n} = \lim_{n\to\infty} \frac{n^3 - n^2 + 3n}{n^3 + 3n^2 - 5n + 1} = 1.$$

The limit of $\frac{a_n}{b_n}$ is indeed nonzero, so we may apply the Limit Comparison Test. Therefore, since the harmonic
series diverges, so does our given series. □
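
The limit used in Problem 7.16 can be sanity-checked by evaluating the ratio $a_n / b_n$ for large $n$. The sketch below uses our own function names `a` and `b` for the two sequences.

```python
def a(n):
    """Term of the series in Problem 7.16."""
    return (n**2 - n + 3) / (n**3 + 3 * n**2 - 5 * n + 1)

def b(n):
    """Term of the harmonic series, used as the comparison sequence."""
    return 1 / n

for n in (10, 100, 1000, 10**6):
    # The ratio approaches 1, the quotient of the leading coefficients.
    print(n, a(n) / b(n))
```

The ratio is noticeably below 1 for small $n$ but creeps up toward 1, which is all the Limit Comparison Test needs.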


We use the Limit Comparison Test to compare an unfamiliar series to a simpler 
series whose behavior we know. Usually, our “simpler” series is a geometric 


series or a p-series. 
We can use the Limit Comparison Test to make a general statement about a series whose terms are given by a
rational function: 


Problem 7.17: Suppose $f(n)$ and $g(n)$ are polynomials. When does $\sum_{n=1}^{\infty} \frac{f(n)}{g(n)}$ converge?


Solution for Problem 7.17: If $\deg f \ge \deg g$, then the Divergence Test applies: since $\lim_{n\to\infty} \frac{f(n)}{g(n)} \ne 0$,
the series must diverge.

If $\deg f < \deg g$, then we compare the terms $a_n = \frac{f(n)}{g(n)}$ of our desired series with the terms $b_n = \frac{1}{n^{\deg g - \deg f}}$ of
the p-series with $p = \deg g - \deg f$. We notice that

$$\frac{a_n}{b_n} = \frac{f(n)\cdot n^{\deg g - \deg f}}{g(n)}$$

is a ratio of polynomials (in terms of $n$) of degrees equal to $\deg g$, and thus $\lim_{n\to\infty} \frac{a_n}{b_n}$ is equal to the quotient of
the leading coefficients of $f$ and $g$. In particular, this limit is nonzero, so we may compare the series using the
Limit Comparison Test. Since the p-series diverges for $p = 1$ and converges for $p > 1$, we see that we must have
$\deg g - \deg f > 1$ for the series to converge. □


Important: If $f$ and $g$ are nonzero polynomials such that $g(k) \ne 0$ for all positive integers $k$,
then the series

$$\sum_{k=1}^{\infty} \frac{f(k)}{g(k)}$$

converges if and only if the degree of $g$ is at least 2 more than the degree of $f$.


There’s one more important convergence test for positive series, as we can see in the following example: 


Problem 7.18: For what values of $c > 0$ does $\sum_{n=1}^{\infty} \frac{c^n}{n!}$ converge?


Solution for Problem 7.18: The denominator grows much faster than the numerator (and in particular the limit of
the terms is 0), so we suspect that the series converges.

Let $a_n = \frac{c^n}{n!}$ be the $n$th term of the sequence. Looking more closely, we notice that the ratio between successive
terms of the sequence is:

$$\frac{a_{n+1}}{a_n} = \frac{c^{n+1}/(n+1)!}{c^n/n!} = \frac{c}{n+1}.$$

In particular, once we have $n > c$, the ratio of successive terms is less than 1. In other words, after the first $c$ terms,
the sequence “looks like” a geometric series with ratio less than 1.

To be a bit more specific, let $r = \frac{c}{c+1} < 1$. Then for all $n > c$, we have

$$\frac{a_{n+1}}{a_n} = \frac{c}{n+1} < \frac{c}{c+1} = r.$$

Hence, by induction, $a_{n+k} \le r^k a_n$ for all $k \ge 0$.

Therefore, the series (after the first $n$ terms) is less than a geometric series with common ratio $r < 1$, which we
know converges. Thus, by the Series Comparison Test, our given series converges for any positive value of $c$. □


In fact, the series of Problem 7.18 converges to $e^c - 1$; we'll see how to prove this in Section 7.6.
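
We can watch both the shrinking ratio $c/(n+1)$ and the claimed sum $e^c - 1$ numerically. The sketch below is only illustrative; `partial_sum` is a name we made up, and it builds each term from the previous one using the ratio computed in the solution.

```python
import math

def partial_sum(c, terms):
    """Sum of c^n / n! for n = 1..terms, using a_n = a_(n-1) * c / n."""
    total, term = 0.0, 1.0          # term starts at c^0 / 0! = 1 (not included in the sum)
    for n in range(1, terms + 1):
        term *= c / n
        total += term
    return total

c = 3.0
print([round(c / (n + 1), 3) for n in (1, 2, 3, 5, 10, 50)])   # ratios a_(n+1) / a_n
print(partial_sum(c, 60), math.exp(c) - 1)
```

Once $n$ passes $c$, each term is a fixed fraction smaller than the one before, which is why so few terms are needed.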


Extending the example from Problem 7.18 to more general series gives us a new test, called the Ratio Test. The 
basic idea is that we try to see how close our series is to a geometric series of common ratio r. 


Important: The Ratio Test for series: Suppose $\sum a_n$ is a series with positive terms. Let
$r = \lim_{n\to\infty} \frac{a_{n+1}}{a_n}$ be the limit of the ratio of successive terms.

• If $r < 1$, then the series converges.
• If $r > 1$ (including $r = \infty$), then the series diverges.
• If $r = 1$, then this test doesn’t give us any information.


It is important to note that the $r = 1$ case is inconclusive. For example, the series $\sum_{n=1}^{\infty} \frac1n$ and $\sum_{n=1}^{\infty} \frac{1}{n^2}$ both have
the limit of the ratio of consecutive terms equal to 1, but we know that the first one diverges whereas the second
one converges.


EXERCISES 


7.3.1 Determine if the following series converge or diverge. If possible, determine what they converge to. 


— n+l — 2 a 

(a) MF n2+3n+1 (b) Y eT (c) y; Bn Hints: 84 
iA n=l n=1 
= (nl)? fe BB () wins 

(d) i (2n)! Hints: 201 (e) yy sin : Hints: 120 


7.3.2 Modify Problem 7.12 to compare series whose terms are all nonpositive. 

7.3.3 

(a) Prove the convergence part of the Integral Test: suppose an infinite sequence $\{a_n\}_{n=1}^{\infty}$ is given by $a_n = f(n)$ for
some continuous decreasing positive function $f$. If $\int_1^{\infty} f(x)\,dx$ converges, then $\sum_{k=1}^{\infty} a_k$ converges.

(b) Prove, in the case of part (a), that

$$\sum_{k=2}^{\infty} a_k \le \int_1^{\infty} f(x)\,dx \le \sum_{k=1}^{\infty} a_k.$$


Hints: 95 
7.3.4 
(a) Prove the Limit Comparison Test. Hints: 188, 37 


(b) Prove that if $\{a_n\}$, $\{b_n\}$ are sequences with nonnegative terms such that $\lim_{n\to\infty} \frac{a_n}{b_n} = 0$, and $\sum b_n$ converges, then
$\sum a_n$ converges. Hints: 242

(c) State and prove a result similar to part (b), but where $\lim_{n\to\infty} \frac{a_n}{b_n} = \infty$. Hints: 203
pot
7.3.5x Determine if ti Homan CONVENES oF diverges. Hints: 41, 289 
n=1 


7.3.6★ Prove the Root Test for series: Suppose $\sum a_n$ is a series with positive terms. Let $r = \lim_{n\to\infty} \sqrt[n]{a_n}$. Then:

• If $r < 1$, then the series converges.
• If $r > 1$ (including $r = \infty$), then the series diverges.
• If $r = 1$, then this test doesn’t give us any information.


Hints: 48, 146, 257 


7.4 ALTERNATING SERIES 


All of the convergence tests that we’ve developed so far are for series all of whose terms are nonnegative. The 
tests can be easily modified for series all of whose terms are nonpositive: just multiply all the terms by —1. But 
trickier are series that have both positive and negative terms. 


One easy way to show that a general series converges is to show that it still converges when we replace each
term by its absolute value:


Definition: We say that $\sum a_n$ is absolutely convergent if $\sum |a_n|$ converges.


Absolute convergence is a stronger condition than convergence:


Problem 7.19: Prove that if $\sum a_n$ is absolutely convergent, then $\sum a_n$ converges.


Solution for Problem 7.19: Since all of our convergence tests from Section 7.3 deal with series with nonnegative
terms, we write the terms in our given series as the difference of two nonnegative terms:

$$a_n = (a_n + |a_n|) - |a_n|$$

for all $n$. We know that $\sum |a_n|$ converges by definition, and furthermore we have

$$0 \le a_n + |a_n| \le 2|a_n|,$$

so by the Series Comparison Test, $\sum (a_n + |a_n|)$ also converges. Thus, since a difference of convergent series is also a
convergent series, we have

$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} (a_n + |a_n|) - \sum_{n=1}^{\infty} |a_n|,$$

and hence $\sum a_n$ converges. □


For example, we now know that

$$1 - \frac14 + \frac19 - \frac1{16} + \frac1{25} - \cdots$$

converges, because it absolutely converges; that is,

$$1 + \frac14 + \frac19 + \frac1{16} + \frac1{25} + \cdots$$

converges. Thus, $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2}$ is absolutely convergent.


However, there are series that converge but that do not converge absolutely: 


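Problem 7.20: Show that the alternating harmonic series

$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = 1 - \frac12 + \frac13 - \frac14 + \cdots$$

converges.


Solution for Problem 7.20: We can explore this by looking at the partial sums:

$$\begin{aligned}
s_1 &= a_1 = 1 \\
s_2 &= 1 - \tfrac12 = \tfrac12 \\
s_3 &= \tfrac12 + \tfrac13 = \tfrac56 \\
s_4 &= \tfrac56 - \tfrac14 = \tfrac7{12} \\
s_5 &= \tfrac7{12} + \tfrac15 = \tfrac{47}{60} \\
s_6 &= \tfrac{47}{60} - \tfrac16 = \tfrac{37}{60}
\end{aligned}$$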


We can represent this graphically on the number line, as in the diagram below. Each arrow is the result of 
adding the next term. 


We notice that the sums of an odd number of terms $\left(1, \frac56, \frac{47}{60}, \ldots\right)$ are all greater than the sums of an even number
of terms $\left(\frac12, \frac7{12}, \frac{37}{60}, \ldots\right)$. Furthermore, the odd sums are decreasing whereas the even sums are increasing, so
since both sequences (the odd sums and the even sums) are bounded and monotonic, they each converge. Finally,
because $\lim_{n\to\infty} (s_{2n+1} - s_{2n}) = 0$, the odd sums and the even sums must converge to the same value, which is the value
to which the entire series converges. □
The alternating harmonic series actually converges to log 2, but we won't be able to prove this until Section 7.6. 
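
The squeeze between odd and even partial sums is easy to see on a computer. The following sketch (variable names are our own) computes the first 2000 partial sums and compares the two subsequences with $\log 2$.

```python
import math
from itertools import accumulate

terms = [(-1) ** (n + 1) / n for n in range(1, 2001)]
s = list(accumulate(terms))      # s[i] is the sum of the first i + 1 terms
odd_sums = s[0::2]               # s_1, s_3, s_5, ... (decreasing)
even_sums = s[1::2]              # s_2, s_4, s_6, ... (increasing)

print(odd_sums[:3], even_sums[:3])
# The limit is trapped between the last even sum and the last odd sum.
print(even_sums[-1], math.log(2), odd_sums[-1])
```

The two printed bounds differ by exactly the size of the last term added, so the trap closes as slowly as $1/n$ shrinks.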
The argument that we used in Problem 7.20 to prove that the alternating harmonic series converges can be 


generalized: 
We can sketch a proof; it’s essentially the same as our argument in Problem 7.20 of the convergence of the
alternating harmonic series. Suppose $a_1 > 0$ and $a_2 < 0$ (the proof is essentially the same if they are reversed). We
let $s_n = a_1 + a_2 + \cdots + a_n$ be the sum of the first $n$ terms.


Note that $s_1, s_3, s_5, \ldots$ is a strictly decreasing sequence: for example, note that $s_3 = s_1 + a_2 + a_3$, and the conditions
imply that $a_2 + a_3 < 0$. Similarly, $s_2, s_4, s_6, \ldots$ is a strictly increasing sequence. Further, each $s_i$ (with $i$ odd) is strictly
greater than each $s_j$ (with $j$ even), and $\lim (s_i - s_j) = 0$.


Thus, the odd $s$’s converge downwards and the even $s$’s converge upwards, and where they meet is the limit
of the original sequence. They must converge to the same value, since $\lim (s_i - s_j) = 0$.


The alternating harmonic series from Problem 7.20 is an example of a series that is convergent but that is not
absolutely convergent. Such a series is called conditionally convergent. One thing we have to be very careful
about is that if we have a series that is convergent but not absolutely convergent, then the order in which
we sum the terms of the series is very important! For example, if we make the faulty assumption that we can
“rearrange” the terms of the series, then we can arrive at the following bogus conclusion:


Clearly the sum is not 0: indeed, we saw in Problem 7.20 that it is between $\frac12$ and 1 (in fact, it is $\log 2$). So what
we did must be illegal.
WARNING!! We can’t arbitrarily rearrange terms in an infinite series. This is why series
are formally defined as sequences of partial sums: the order in which we
sum the terms of a sequence to get the sum of a series is important!


A remarkable fact is that we can rearrange the terms of a conditionally convergent series to get it to sum to 
any real number! So we have to be really careful when working with them. We don’t have such a problem with 
absolutely convergent series: any rearrangement of the terms will still converge to the same sum, although this is 
somewhat difficult to prove. 
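
One standard way to see why a conditionally convergent series can be steered to any target is a greedy rearrangement: add unused positive terms while the running total is at or below the target, and unused negative terms while it is above. The sketch below applies this to the alternating harmonic series; the function name and the step count are our own choices, and it illustrates the idea rather than proving the theorem.

```python
import math

def rearranged_partial_sum(target, steps):
    """Greedily rearrange 1 - 1/2 + 1/3 - 1/4 + ... to approach `target`."""
    next_odd, next_even = 1, 2     # denominators of the next unused positive / negative term
    total = 0.0
    for _ in range(steps):
        if total <= target:
            total += 1.0 / next_odd    # use the next positive term 1, 1/3, 1/5, ...
            next_odd += 2
        else:
            total -= 1.0 / next_even   # use the next negative term -1/2, -1/4, ...
            next_even += 2
    return total

for target in (0.0, 1.5, math.pi):
    print(target, rearranged_partial_sum(target, 200000))
```

Every term of the original series is eventually used exactly once, yet the running total hovers around whichever target we picked. This works because the positive terms alone and the negative terms alone each diverge, while the individual terms shrink to 0.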


EXERCISES 


7.4.1 For each of the following series, determine whether it converges or diverges. If it converges, determine 
whether it is absolutely convergent, and if possible compute the value to which it converges. 


= (-1) = (-1)' — (-1)' 
(a) (b) (c) 

= (-1)' log k = (-1)* — (-1)F 
(d) y 7 (e) Y= () ie 

k=1 k=1 ‘ k=1 


7.4.2 Show that if 2; a, converges, then 2, a> converges. 


n=1 n=1 


7.5 TAYLOR POLYNOMIALS 


Back when we first started studying derivatives, we saw that one important application is the tangent line 
approximation of the graph of f near x = a. The tangent line to the graph of f at the point (a, f(a)) has slope f’(a), 
so the formula for the linear function p(x) that approximates f(x) near a is 


p(x) = f(a) + f’(a)(x — a). 


This is called a first-order approximation of f at x = a, since p(a) = f(a) and p’(a) = f’(a). In other words, p matches 
f in both value and first derivative at x = a. 


A better and yet still relatively simple approximation would be with a quadratic polynomial. Of course, we 
will demand that our approximation be slightly better than the linear approximation—specifically, it should also 
match the second derivative at x = a. 


Let’s look at the special case where a = 0, since that’s simpler, and it will guide us to the general case. 


Solution for Problem 7.21: We want a quadratic polynomial, so we know that $p$ is of the form

$$p(x) = c_2 x^2 + c_1 x + c_0.$$

We want to determine the coefficients so that $p(0) = f(0)$, $p'(0) = f'(0)$, and $p''(0) = f''(0)$.


We can just plug in $x = 0$ to get:

$$c_0 = p(0) = f(0).$$

So $c_0 = f(0)$.


When we take the derivative, we have

$$p'(x) = 2c_2 x + c_1,$$

and hence $c_1 = p'(0) = f'(0)$.


When we take the second derivative, we have

$$p''(x) = 2c_2,$$

so we have $2c_2 = p''(0) = f''(0)$, hence $c_2 = \frac12 f''(0)$.

Therefore, our quadratic approximation is

$$p(x) = f(0) + f'(0)x + \frac{f''(0)}{2} x^2.$$

□


Quadratic approximations are typically more accurate than linear approximations, and are especially useful 
in situations when a linear approximation is no good, like the following example: 


Solution for Problem 7.22: We let $f(x) = \cos x$. We could try a linear approximation: we have $f'(x) = -\sin x$, so
$f'(0) = -\sin 0 = 0$, and our linear approximation is

$$p(x) = f(0) + f'(0)x = 1 + 0x = 1.$$

That is, our linear approximation tells us that $\cos x \approx 1$ near $x = 0$. This is interesting, but it doesn’t help us to
approximate $\cos 0.1$ unless we are willing to settle for $\cos 0.1 \approx 1$.


Instead, noting that $f''(x) = -\cos x$ so that $f''(0) = -\cos 0 = -1$, we try a quadratic approximation:

$$p(x) = f(0) + f'(0)x + \frac{f''(0)}{2} x^2 = 1 - \frac12 x^2.$$

This approximation gives us $\cos 0.1 \approx 1 - \frac12 (0.1)^2 = 0.995$. □

If you do this on your calculator, you'll get $\cos 0.1 = 0.995004\ldots$, so our approximation
is pretty good. We can also sketch the graphs $y = \cos x$ and $y = 1 - \frac12 x^2$, at
right, to see how they compare. The graph of $\cos x$ is the dashed black curve and
the graph of $1 - \frac12 x^2$ is the solid gray curve. Near $x = 0$, they are very close (in our
sketch they are virtually indistinguishable), so this is a pretty good approximation
(and much better than the linear approximation $y = 1$).
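
To see how quickly the two approximations of $\cos x$ degrade away from 0, we can tabulate them. This is a small sketch with our own function names `linear` and `quadratic` for the two polynomials found above.

```python
import math

def linear(x):
    """First-order approximation of cos at 0: p(x) = 1."""
    return 1.0

def quadratic(x):
    """Second-order approximation of cos at 0: p(x) = 1 - x^2 / 2."""
    return 1.0 - 0.5 * x * x

for x in (0.1, 0.5, 1.0):
    exact = math.cos(x)
    # Columns: x, cos x, linear error, quadratic error
    print(x, exact, abs(exact - linear(x)), abs(exact - quadratic(x)))
```

At $x = 0.1$ the quadratic error is a few millionths, while the linear error is about 0.005; both errors grow as $x$ moves away from 0.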


How do we generalize this to an approximation near x = a? 


Problem 7.23: Let f be a twice-differentiable function at x = a. Find a quadratic polynomial p such that 


$p(a) = f(a)$, $p'(a) = f'(a)$, and $p''(a) = f''(a)$.


Solution for Problem 7.23: Recall the formula for the linear approximation at $x = a$:

$$p(x) = f(a) + f'(a)(x - a).$$

The trick was writing the polynomial in terms of $x - a$ rather than $x$.

So, using this same trick to get a quadratic approximation at $x = a$, we let

$$p(x) = c_2 (x - a)^2 + c_1 (x - a) + c_0,$$

and solve so that $p(a) = f(a)$, $p'(a) = f'(a)$, and $p''(a) = f''(a)$.


We can do essentially the same calculations that we did in Problem 7.21. Plugging in $x = a$ and solving
$p(a) = f(a)$ gives us

$$c_0 = p(a) = f(a),$$

so $c_0 = f(a)$.

Next, differentiating and plugging in $x = a$ gives us

$$c_1 = p'(a) = f'(a),$$

so $c_1 = f'(a)$.

Lastly, differentiating again and plugging in $x = a$ gives us

$$2c_2 = p''(a) = f''(a),$$

so $c_2 = \frac12 f''(a)$.

Therefore, the quadratic approximation is

$$p(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2} (x - a)^2.$$

Note that when $a = 0$ this gives us our approximation near $x = 0$ from Problem 7.21. □


Important: The quadratic approximation of a twice-differentiable function $f$ at $x = a$ is

$$p(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2} (x - a)^2.$$

Note that $p$ agrees with $f$ in value, first derivative, and second derivative at
$x = a$. We say that $p$ is a second-order approximation of $f$.


The more general form of the quadratic approximation is useful when we want to approximate values far from 
x=0. 


Problem 7.24: Estimate log 0.9 using quadratic estimation. 


Solution for Problem 7.24: We set up a quadratic approximation at $a = 1$. Let $f(x) = \log x$. Then $f'(x) = \frac1x$ and
$f''(x) = -\frac{1}{x^2}$. Thus $f'(1) = 1$ and $f''(1) = -1$, and the approximating quadratic polynomial is

$$p(x) = f(1) + f'(1)(x - 1) + \frac{f''(1)}{2}(x - 1)^2 = (x - 1) - \frac12 (x - 1)^2.$$

Plugging in $x = 0.9$ gives

$$p(0.9) = (-0.1) - (0.5)(-0.1)^2 = -0.1 - 0.005 = -0.105.$$

Hence, $\log 0.9 \approx -0.105$. Your calculator will say $\log 0.9 = -0.10536\ldots$, so again this is a pretty good approximation,
much better than the linear $\log 0.9 \approx -0.1$. □


It should seem reasonable that we can make a polynomial approximation even better by adding more terms. 
Again, to keep things simple, we'll start by examining approximations near 0. 
Problem 7.25: We want to construct a degree $n$ polynomial

$$p(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0$$

such that $p^{(k)}(0) = f^{(k)}(0)$ for all $0 \le k \le n$. (Note that $f^{(0)}(x) = f(x)$ by convention.)
(a) For all $0 \le k \le n$, compute $p^{(k)}(0)$ in terms of the coefficients $\{c_i\}$.
(b) Solve for the coefficients so that $p^{(k)}(0) = f^{(k)}(0)$ for all $0 \le k \le n$.


Solution for Problem 7.25:


(a) We can certainly compute the $k$th derivative $p^{(k)}(x)$ as a polynomial in $x$, but it’s a bit messy, and we only need
to compute $p^{(k)}(0)$. This means that we only need to determine the constant term of $p^{(k)}(x)$.


We know that taking the derivative of a monomial reduces the degree by 1. Therefore, the $k$th derivative
of a monomial will be a nonzero constant only if the monomial is $c_k x^k$. Thus, only the $c_k x^k$ term of $p(x)$ will
matter in $p^{(k)}(0)$. The smaller-degree terms will vanish when we take the $k$th derivative, and the larger-degree
terms will still have a factor of $x$, so they will vanish when we plug in $x = 0$.


So we just need to compute the $k$th derivative of $c_k x^k$. The first derivative is $k c_k x^{k-1}$, the second derivative is
$k(k-1) c_k x^{k-2}$, and so on, as each exponent from $k$ down to 1 gets multiplied as a constant. So the $k$th derivative
of $c_k x^k$ is $k!\,c_k$. Hence $p^{(k)}(0) = k!\,c_k$ for all $0 \le k \le n$.


(b) We need $c_k\,k! = p^{(k)}(0) = f^{(k)}(0)$, hence

$$c_k = \frac{f^{(k)}(0)}{k!}$$

for all $0 \le k \le n$. (Note that $0! = 1! = 1$, so this matches our earlier formulas.) Hence, the degree $n$ polynomial
approximating $f(x)$ at 0 is

$$p(x) = f(0) + f'(0)x + \frac{f''(0)}{2!} x^2 + \frac{f'''(0)}{3!} x^3 + \cdots + \frac{f^{(n)}(0)}{n!} x^n.$$

□


We have thus seen that we can approximate f by a degree n polynomial at x = 0, provided that f is differentiable 
at least n times. We have a special name for these polynomials: 


Definition: Let $f$ be an $n$-times-differentiable function. The degree $n$ Taylor polynomial of $f$ at 0 is the
polynomial

$$p(x) = f(0) + f'(0)x + \frac{f''(0)}{2!} x^2 + \frac{f'''(0)}{3!} x^3 + \cdots + \frac{f^{(n)}(0)}{n!} x^n.$$

It is the unique degree $n$ polynomial with the property that $p^{(k)}(0) = f^{(k)}(0)$ for all $0 \le k \le n$; that is, $p$ is the
unique degree $n$ polynomial whose value at 0 and first $n$ derivatives at 0 agree with those of $f$.


We can easily do the same calculation at x = a (we will omit it here for brevity, but it is exactly the same 
calculation as in Problem 7.25, except we replace x with x — a throughout): 
Definition: Let $f$ be an $n$-times-differentiable function. The degree $n$ Taylor polynomial of $f$ at $a$ (for any
real number $a$ in the domain of $f$) is the polynomial

$$p(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!} (x - a)^2 + \frac{f'''(a)}{3!} (x - a)^3 + \cdots + \frac{f^{(n)}(a)}{n!} (x - a)^n.$$

It is the unique degree $n$ polynomial with the property that $p^{(k)}(a) = f^{(k)}(a)$ for all $0 \le k \le n$; that is, $p$ is the
unique degree $n$ polynomial whose value at $a$ and first $n$ derivatives at $a$ agree with those of $f$.


Think of Taylor polynomials as a natural extension of our earlier concept of 
tangent line approximation. Indeed, the graph of the degree 1 Taylor polynomial 
of f at ais the tangent line to f at (a, f(a)). 


You will also sometimes see the term Maclaurin polynomial to mean the Taylor polynomial at 0, but this term
is less commonly used.


Of course, we can only have a Taylor polynomial of f at a if all the derivatives that we need are defined. 


Let’s try a quick example: 


Problem 7.26: Estimate sin 0.2 using a cubic Taylor polynomial. 


Solution for Problem 7.26: Of course, we choose to construct the polynomial at 0. (Always try for 0 unless there’s 
a good reason not to.) 


We list the function and its derivatives at 0:

$$\begin{aligned}
f(x) &= \sin x & f(0) &= 0 \\
f'(x) &= \cos x & f'(0) &= 1 \\
f''(x) &= -\sin x & f''(0) &= 0 \\
f'''(x) &= -\cos x & f'''(0) &= -1
\end{aligned}$$


At this point, don’t make the following silly, but regrettably common, error:


Bogus Solution: The polynomial is

$$p(x) = f(0) + f'(0)x + f''(0)x^2 + f'''(0)x^3 = x - x^3.$$


Each of the coefficients has to be divided by the corresponding factorial. So the polynomial is

$$p(x) = f(0) + f'(0)x + \frac{f''(0)}{2} x^2 + \frac{f'''(0)}{6} x^3 = x - \frac16 x^3.$$


To finish, we plug in $x = 0.2$:

$$\sin 0.2 \approx (0.2) - \frac16 (0.008) = \frac15 - \frac1{750} = \frac{149}{750} = 0.198667\ldots.$$


The calculator says $\sin 0.2 = 0.19866933\ldots$, so this approximation is pretty close. □
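
The recipe "derivative at 0 divided by the matching factorial" is mechanical enough to write as a short function. The sketch below is our own illustration (`taylor_at_zero` is a hypothetical helper); it takes the list of derivative values at 0 and uses exact fractions so the result can be compared with $\frac{149}{750}$ directly.

```python
import math
from fractions import Fraction

def taylor_at_zero(derivs_at_zero, x):
    """Evaluate sum of derivs_at_zero[k] / k! * x^k (a Taylor polynomial at 0)."""
    return sum(Fraction(d, math.factorial(k)) * x ** k
               for k, d in enumerate(derivs_at_zero))

sin_derivs = [0, 1, 0, -1]       # f(0), f'(0), f''(0), f'''(0) for f = sin
estimate = taylor_at_zero(sin_derivs, Fraction(1, 5))
print(estimate, float(estimate), math.sin(0.2))

# The bogus version forgets the factorials and gives x - x^3 instead.
bogus = sum(d * Fraction(1, 5) ** k for k, d in enumerate(sin_derivs))
print(float(bogus))
```

Forgetting the factorials moves the estimate from roughly 0.1987 to 0.192, a much worse approximation of $\sin 0.2$.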
It is reasonable to ask the question: how good are the approximations given by Taylor polynomials? Given
a function $f$, we can estimate the error of a degree $n$ Taylor polynomial $p$ of $f$ at $a$. We define error in the most
obvious manner: as the amount that the approximation differs from the actual value.


For example, in Problem 7.26, the error in our approximation is

$$E(x) = \sin x - x + \frac{x^3}{6},$$

and in particular,

$$E(0.2) = \sin 0.2 - \frac{149}{750}.$$


We would like to be able to estimate the error. Fortunately, we have tools available for this, since the error
function has a particularly nice property. Recall that we’ve rigged our Taylor polynomial $p$ so that its first $n$
derivatives agree with those of $f$ at $a$. But what does that mean for $E(x)$? Since

$$E^{(k)}(x) = f^{(k)}(x) - p^{(k)}(x),$$

we know that $E(a) = 0$ and that the first $n$ derivatives of $E$ vanish at $a$; that is,

$$E(a) = E'(a) = E''(a) = \cdots = E^{(n)}(a) = 0.$$

And since $p$ is a degree $n$ polynomial, its higher derivatives are all 0. So in particular, we have

$$E^{(n+1)}(x) = f^{(n+1)}(x).$$


Suppose we are using $p$ to estimate $f(c)$ for some $c > a$. We know that if $f^{(n+1)}$ is continuous, then $|f^{(n+1)}(x)|$
has some maximum value $M$ on $[a, c]$. So

$$|E^{(n+1)}(x)| \le M.$$


What happens if we integrate this $n + 1$ times? We get

$$|E(x)| \le \frac{M}{(n+1)!} (x - a)^{n+1} \le \frac{M}{(n+1)!} (c - a)^{n+1}.$$


Note that all of the constants of integration disappear while we do this, since all of the lower derivatives of E 
vanish at a. 


So our conclusion is: 
Here’s an example:


Solution for Problem 7.27: We have the quadratic approximation

$$p(x) = (x - 1) - \frac12 (x - 1)^2,$$

which gave us

$$p(0.9) = (-0.1) - (0.5)(-0.1)^2 = -0.1 - 0.005 = -0.105.$$

So $\log 0.9 \approx -0.105$.


To determine the error bound, we need to look at $f'''(x) = \frac{2}{x^3}$ on the interval $[0.9, 1]$. This is maximized at
$x = 0.9$, giving $f'''(0.9) = \frac{2}{(0.9)^3} = \frac{2000}{729} = 2.7435\ldots$. Thus, our error bound is

$$\left| \frac{2000/729}{3!} (-0.1)^3 \right| = \frac{1}{2187} = 0.000457\ldots$$


Note that this is the absolute value of the maximum possible error, not the actual error. Indeed, a calculator will
show that the actual error is about 0.000361. □
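
The numbers in this solution can be reproduced in a few lines. The sketch below hard-codes the quadratic for $\log x$ at $a = 1$ and the bound $\frac{M}{(n+1)!}|x - a|^{n+1}$ with $M = \max |f'''|$ on $[0.9, 1]$; the variable names are our own.

```python
import math

a, x, n = 1.0, 0.9, 2
p = (x - a) - 0.5 * (x - a) ** 2           # quadratic Taylor polynomial of log at 1
M = 2 / min(a, x) ** 3                     # |f'''(t)| = 2 / t^3 is largest at the left endpoint
bound = M / math.factorial(n + 1) * abs(x - a) ** (n + 1)
actual = abs(math.log(x) - p)

print(p, bound, actual)
```

The actual error comes out somewhat smaller than the bound, as it should: the bound assumes the third derivative is at its maximum over the whole interval.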


EXERCISES 

7.5.1 Approximate $\sqrt{0.9}$ using a cubic Taylor polynomial. Determine the bound on the error.

7.5.2 What is the estimate of $(1 + \epsilon)^n$ by a cubic Taylor polynomial, where $n$ is a positive integer?
7.5.3 Estimate $e^{0.1}$ to 6 decimal places using a Taylor polynomial about 0.

7.5.4 Let $p(x) = ax^2 + bx + c$ be a quadratic polynomial. Find the degree 2 Taylor polynomial of $p$ at:
(a) $x = 0$ (b) $x = 1$ (c) $x = r$ where $r$ is a root of $p$ Hints: 171, 164


7.6 TAYLOR SERIES 


If f is an infinitely-differentiable function, we can now approximate f by polynomials with arbitrarily high 
degree. We can “take the limit” of these polynomials to get an “infinite-degree polynomial,” called a power series. 


In general, a power series is an infinite series of the form $\sum_{k=0}^{\infty} c_k (x-a)^k$ for some sequence of coefficients $\{c_k\}$.

Most often, we have a = 0, giving a power series of the form $\sum_{k=0}^{\infty} c_k x^k$. If we let the coefficients be the coefficients of a Taylor polynomial, then we get:


Definition: The Taylor series at a of an infinitely-differentiable function f is 


\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x-a)^k, \]


for the values of x for which the infinite series converges. This is also called the Maclaurin series of f if a = 0. 
Note that the first n + 1 terms of the power series representation of f at a give us exactly the degree n Taylor polynomial of f at a. So we can think of the Taylor series as the limit as n → ∞ of the degree n Taylor polynomials at a (even though we have not defined what a “limit of polynomials” means).


We'll start with what arguably is the most important example: 


Solution for Problem 7.28: We know that all derivatives of $e^x$ are themselves $e^x$. So for all k ≥ 0, we have $f^{(k)}(0) = e^0 = 1$. Hence the Taylor series is
\[ 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{x^k}{k!}. \]

To determine the values of x for which this converges, we can use the Ratio Test, which is usually the most useful test for power series. The ratio of successive terms of the series is
\[ \frac{x^{k+1}/(k+1)!}{x^k/k!} = \frac{x}{k+1}. \]
For any value of x, we have $\lim_{k\to\infty} \frac{x}{k+1} = 0$. Thus, by the Ratio Test, the Taylor series converges for all x ∈ ℝ. □


In Problem 7.28, we showed that $\sum_{k=0}^{\infty} \frac{x^k}{k!}$ converges, but we don’t know what it converges to. We’d like to say that
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots, \]


but how do we know that this is true, and that the power series doesn’t converge to some other value? 


What we do know is that
\[ 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} \]
is the degree n Taylor polynomial approximation, and moreover we know that the error in this approximation is given by
\[ |E_n(x)| = \left| e^x - \left(1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}\right) \right| \le \left| \frac{M x^{n+1}}{(n+1)!} \right|, \]
where M is the maximum value of the (n + 1)st derivative of exp between 0 and x. This derivative is itself the exponential function, so M is $e^x$ (if x > 0) or $e^0 = 1$ (if x < 0), but the important fact is that M does not depend on n. Also, we just showed in Problem 7.28 that the series $\sum_{k=0}^{\infty} \frac{x^k}{k!}$ converges for all x, and thus by the Divergence Test, the terms of the sequence that we are summing must converge to 0. Thus
\[ \lim_{n\to\infty} |E_n(x)| \le \lim_{n\to\infty} \left| \frac{M x^{n+1}}{(n+1)!} \right| = M \lim_{n\to\infty} \left| \frac{x^{n+1}}{(n+1)!} \right| = 0. \]

That is, the error terms $E_n(x)$ converge to 0 as n → ∞, and thus we can say that $e^x$ equals the sum of its power series:
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \]
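To see the error bound at work, here is a small sketch that compares the actual error of the degree n Taylor polynomial of exp with the bound $M|x|^{n+1}/(n+1)!$. The helper names and the sample point x = 2 are arbitrary choices made for illustration.

```python
import math

def exp_taylor(x, n):
    """Degree-n Taylor polynomial of exp at 0, summed term by term."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k  # builds x^k / k! from the previous term
        total += term
    return total

def exp_error_bound(x, n):
    """Bound M |x|^(n+1) / (n+1)!, with M = max(e^x, 1)."""
    M = max(math.exp(x), 1.0)
    return M * abs(x) ** (n + 1) / math.factorial(n + 1)

x = 2.0
for n in (2, 5, 10, 15):
    err = abs(math.exp(x) - exp_taylor(x, n))
    print(n, err, exp_error_bound(x, n))
```

Both columns shrink toward 0 as n grows, with the actual error staying below the bound.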
In particular, we have a proof of a fact that we have seen in the past:


A function equals the sum of its Taylor series, where the series converges, if and only if the error terms converge to 0. Thus, if we can prove that the bounds of the error terms of the Taylor polynomials of a function f(x) converge to 0 as n → ∞, then we have proved that the sum of the Taylor series is equal to f(x).


The converse of the last sentence above is not true: a function could have the bounds of the error terms of its 
Taylor polynomials not converge to 0, yet the error itself converges to 0, and thus the function is equal to the sum 
of its Taylor series. However, examples of this are very difficult to produce. The main idea to keep in mind is that 
a Taylor series of f(x) does not automatically converge to f(x): it may converge to something else (see Problem 
7.50 for an example), or it may not converge. We have to do additional work to show that a Taylor series of f(x) 
actually sums to f(x). 


Naturally, the question of convergence of power series is central to using them effectively, so next we examine 
when power series converge. For simplicity, we'll initially consider power series at 0. 


Problem 7.29: Let $\sum_{n=0}^{\infty} c_n x^n$ be a power series, for some sequence $\{c_n\}_{n=0}^{\infty}$ of coefficients, and let r, s be real numbers with |r| < |s|. Show that if the power series converges at x = s, then it converges absolutely at x = r.


Solution for Problem 7.29: Since $|c_n r^n| \le |c_n s^n|$ for all n, it would be easy if we could use the Series Comparison Test. Unfortunately, we don’t know that $\sum |c_n s^n|$ converges, since we only know that $\sum c_n s^n$ converges, not that it converges absolutely. So we must use a little trick to get around this.

If $\sum c_n s^n$ converges, then its terms converge to 0, so the sequence $\{|c_n s^n|\}$ must be bounded; let $B = \sup\{|c_n s^n|\}$. Then (noting that s ≠ 0 since 0 ≤ |r| < |s|), we have $|c_n| \le \frac{B}{|s|^n}$ for all n. This lets us bound $|c_n r^n|$ using $\left|\frac{r}{s}\right|$, as follows:
\[ |c_n r^n| \le \frac{B}{|s|^n}|r|^n = B\left|\frac{r}{s}\right|^n. \]
But we know that a geometric series with ratio $\left|\frac{r}{s}\right| < 1$ converges. Therefore, by the Series Comparison Test, the series $\sum |c_n r^n|$ also converges, and thus the power series $\sum c_n x^n$ converges absolutely at x = r. □


Problem 7.30: Let $\sum_{n=0}^{\infty} c_n x^n$ be a power series. Show that either the power series converges absolutely for all x ∈ ℝ, or there exists some nonnegative R ∈ ℝ such that the power series converges absolutely for all x ∈ (−R, R) and diverges for all x ∈ (−∞, −R) ∪ (R, ∞).


Before proceeding with the proof, notice that the statement says nothing about what happens at x = R or x = −R. At either of these values, the power series might converge absolutely, converge conditionally, or diverge.

Solution for Problem 7.30: Let
\[ S = \left\{ |x| \;:\; \sum_{n=0}^{\infty} c_n x^n \text{ converges at } x \right\}. \]


If S is unbounded, then for any r ∈ ℝ, there exists s ∈ ℝ such that |s| ∈ S with |r| < |s|. By the definition of S, the power series converges at x = s. Hence, by Problem 7.29, the power series converges absolutely at x = r. Since r ∈ ℝ was arbitrary, we conclude that the power series converges absolutely for all x ∈ ℝ.

Otherwise, S is bounded, and is nonempty since 0 ∈ S (because every power series at 0 converges at x = 0). Thus S has a least upper bound; let R = sup S. For any r ∈ ℝ such that |r| < R, there exists s ∈ ℝ with |s| ∈ S such that |r| < |s| ≤ R (because otherwise, |r| would be an upper bound for S, contradicting the fact that |r| < R). By the definition of S, the power series converges at x = s. Thus, by Problem 7.29, the power series converges absolutely for x = r. Since this is true for any r ∈ ℝ such that |r| < R, we conclude that the power series converges absolutely for all x ∈ (−R, R).

Finally, if r ∈ ℝ is such that |r| > R, then |r| ∉ S, so the power series does not converge at x = r. □


We also note that every power series of the form $\sum c_n x^n$ converges at x = 0. If it converges nowhere else, we are in the case R = 0 of Problem 7.30, for which the set S = {0}. Summarizing the result of Problem 7.30 gives us:

Definition: The radius of convergence of the power series $\sum_{n=0}^{\infty} c_n x^n$ is:

• ∞ if the series converges for all x ∈ ℝ,
• 0 if the series converges for x = 0 and diverges for all x ≠ 0, or
• R if the series converges (absolutely) for x ∈ (−R, R) and diverges for x ∈ (−∞, −R) ∪ (R, ∞).


Happily, there is a straightforward way to determine the radius of convergence of a power series in terms of 
its coefficients: 


Problem 7.31: Let $\sum_{n=0}^{\infty} c_n x^n$ be a power series, for some sequence $\{c_n\}$ of coefficients. Let
\[ R = \lim_{n\to\infty} \left| \frac{c_n}{c_{n+1}} \right|, \]
provided this limit exists. Show that R is the radius of convergence of $\sum c_n x^n$. (In particular, if R = ∞, show that the power series converges absolutely for all x ∈ ℝ.)


Solution for Problem 7.31: As in Problem 7.28, we use the Ratio Test to examine the convergence of the power series $\sum |c_n x^n|$. We compute the limit of the ratio of two consecutive terms of the series:
\[ \lim_{n\to\infty} \frac{|c_{n+1} x^{n+1}|}{|c_n x^n|} = \lim_{n\to\infty} \left| \frac{c_{n+1}}{c_n} \right| |x|. \]

If R = ∞, then the above limit is 0 for all x ∈ ℝ, and thus by the Ratio Test, the series converges for all x ∈ ℝ. If R = 0, then the above limit is ∞ for all nonzero x, and thus by the Ratio Test, the series converges only for x = 0. Otherwise, the limit is
\[ \lim_{n\to\infty} \left| \frac{c_{n+1}}{c_n} \right| |x| = \frac{|x|}{R}. \]

If |x| < R, then this limit is less than 1, so by the Ratio Test, the series converges. On the other hand, if |x| > R, then this limit is greater than 1, so by the Ratio Test, the series diverges. □


Note again that we get no information about what happens at x = R and x = —R. The Ratio Test is inconclusive 
and we need to check convergence directly at those two points. 
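The ratio formula can be explored numerically. The sketch below evaluates $|c_n / c_{n+1}|$ exactly for three sample coefficient sequences; the sequences and helper names are chosen here for illustration and are not taken from the text.

```python
from fractions import Fraction
from math import factorial

def ratio_estimate(coeff, n):
    """|c_n / c_(n+1)|, whose limit (if it exists) is the radius of convergence."""
    return abs(Fraction(coeff(n)) / Fraction(coeff(n + 1)))

def geometric(n):
    return Fraction(2) ** n            # series sum 2^n x^n, radius 1/2

def harmonic_like(n):
    return Fraction(1, n + 1)          # series sum x^n / (n+1), radius 1

def exponential(n):
    return Fraction(1, factorial(n))   # series sum x^n / n!, radius infinity

for n in (1, 10, 100):
    print(n,
          float(ratio_estimate(geometric, n)),
          float(ratio_estimate(harmonic_like, n)),
          float(ratio_estimate(exponential, n)))
```

The first column is constant, the second tends to 1, and the third grows without bound, matching the three cases in the definition of the radius of convergence.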


The above results all hold for general power series of the form $\sum_{n=0}^{\infty} c_n (x-a)^n$, except then we replace the conditions |x| < R (or |x| > R) with |x − a| < R (or |x − a| > R). The results are:

Definition: The radius of convergence of the power series $\sum_{n=0}^{\infty} c_n (x-a)^n$ is:

• ∞ if the series converges for all x ∈ ℝ,
• 0 if the series converges for x = a and diverges for all x ≠ a, or
• R if the series converges (absolutely) for x ∈ (a − R, a + R) and diverges for x ∈ (−∞, a − R) ∪ (a + R, ∞).

Important: The radius of convergence of the power series $\sum_{n=0}^{\infty} c_n (x-a)^n$ is
\[ R = \lim_{n\to\infty} \left| \frac{c_n}{c_{n+1}} \right|, \]
provided this limit exists. In particular, if R = ∞, then the power series converges absolutely for all x ∈ ℝ.


Our next task is to develop a catalog of Taylor series of some common functions. The first class of functions that 
we mention is trivial: the Taylor series of a polynomial function is just the polynomial itself, since the higher-order 
derivatives of a polynomial function are all 0. This “series” clearly always converges to the original function. 


The next class of functions is the basic trig functions: 


Solution for Problem 7.32: We do sine first. The derivatives are
\[ \sin x,\ \cos x,\ -\sin x,\ -\cos x,\ \sin x,\ \ldots, \]
which cycle every 4 terms. So the values of these derivatives at x = 0 are:
\[ 0,\ 1,\ 0,\ -1,\ 0,\ 1,\ 0,\ -1,\ \ldots, \]
repeating every four terms. Thus, we have the Taylor series
\[ x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots \]

This can be written in summation notation as:
\[ \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}. \]

To determine the radius of convergence, we can’t directly use the Ratio Test, since every other term is 0. But we can still take the ratio of two consecutive non-zero terms:
\[ \frac{(-1)^{n+1} \frac{x^{2n+3}}{(2n+3)!}}{(-1)^n \frac{x^{2n+1}}{(2n+1)!}} = -\frac{x^2}{(2n+3)(2n+2)}. \]

For any fixed x, the limit of this as n → ∞ is 0. Thus, the Taylor series converges for all x ∈ ℝ.
Also, we compute the error term for the degree n Taylor polynomial approximation. Note that $\left| \frac{d^{n+1}}{dx^{n+1}} \sin x \right| \le 1$ for all x, hence
\[ |E_n(x)| \le \left| \frac{x^{n+1}}{(n+1)!} \right|, \]
which approaches 0 as n → ∞. Therefore, the Taylor series converges to sin x for all x ∈ ℝ, and we are justified in writing
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} \]
for all x ∈ ℝ.

There’s a similar pattern for cosine:
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}, \]
and this series also converges for all x ∈ ℝ. (The reasoning is essentially the same as that for sine, so we will omit the details.) □


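One really clever thing to note is that, if we extend our use of power series to complex numbers (which we can do, although we have not proved this), then the power series of exp, sin, and cos give us a proof of Euler’s Formula! We write
\[ e^{ix} = 1 + ix + \frac{(ix)^2}{2!} + \frac{(ix)^3}{3!} + \frac{(ix)^4}{4!} + \cdots \]

We know that the powers of i cycle as 1, i, −1, −i, 1, i, −1, −i, 1, ..., so we can simplify:
\[ e^{ix} = 1 + ix - \frac{x^2}{2!} - i\frac{x^3}{3!} + \frac{x^4}{4!} + i\frac{x^5}{5!} - \cdots \]

Separate the real and imaginary parts, and we’re done:
\[ e^{ix} = \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\right) + i\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots\right) = \cos x + i \sin x. \]

So we have “proved” Euler’s Formula:
\[ e^{ix} = \cos x + i \sin x. \]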
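Euler’s Formula can also be checked numerically by summing the exponential power series at a purely imaginary argument. This is only a sanity check under floating-point arithmetic, not a proof; the function name and the number of terms are arbitrary choices.

```python
import math

def exp_series(z, terms=40):
    """Partial sum of the power series for exp, evaluated at a complex number z."""
    term, total = 1 + 0j, 1 + 0j
    for k in range(1, terms):
        term *= z / k  # builds z^k / k! from the previous term
        total += term
    return total

x = 1.2
lhs = exp_series(1j * x)
rhs = complex(math.cos(x), math.sin(x))
print(lhs, rhs, abs(lhs - rhs))
```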


We continue building our catalog of power series: 


Problem 7.33: Find the Taylor series for log x about x = 1, and determine where it converges. 


Solution for Problem 7.33: Note that we can’t construct a power series of log x about x = 0 since the function is not 
defined there! Instead, x = 1 seems like the best location, since log x and its derivatives are easy to evaluate at 
x=1. 
We start by listing the derivatives of log x:
\[ f'(x) = \frac{1}{x}, \quad f''(x) = -\frac{1}{x^2}, \quad f'''(x) = \frac{2}{x^3}, \quad f^{(4)}(x) = -\frac{6}{x^4}. \]
We can see the pattern in the derivatives of log:
\[ f^{(n)}(x) = (-1)^{n+1} \frac{(n-1)!}{x^n}, \]
hence $f^{(n)}(1) = (-1)^{n+1}(n-1)!$ and $\frac{f^{(n)}(1)}{n!} = \frac{(-1)^{n+1}}{n}$. Thus, the Taylor series of log at x = 1 is
\[ (x-1) - \frac{1}{2}(x-1)^2 + \frac{1}{3}(x-1)^3 - \frac{1}{4}(x-1)^4 + \frac{1}{5}(x-1)^5 - \cdots = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}(x-1)^n. \]

To determine the radius of convergence, we take the ratio of successive coefficients:
\[ \left| \frac{c_n}{c_{n+1}} \right| = \frac{1/n}{1/(n+1)} = \frac{n+1}{n}. \]

This ratio has limit 1 as n → ∞. So the radius of convergence is 1, and we can conclude that the series converges absolutely for x ∈ (0, 2) and diverges for x ∈ (−∞, 0) ∪ (2, ∞). The test doesn’t tell us anything about x = 0 or x = 2. However, for x = 0, the series is the negative of the harmonic series, which diverges, and for x = 2, the series is the alternating harmonic series, which converges. Thus, the Taylor series converges for x ∈ (0, 2].


Next, we check if the Taylor series converges to log x for x ∈ (0, 2]. The error term of the degree n Taylor polynomial is bounded as
\[ |E_n(x)| \le \left| \frac{f^{(n+1)}(z)}{(n+1)!}(x-1)^{n+1} \right| \]
for some z between 1 and x. Note that $f^{(n+1)}(z) = (-1)^n \frac{n!}{z^{n+1}}$. If x > 1, then z > 1 and we have $|f^{(n+1)}(z)| < n!$. Thus
\[ |E_n(x)| \le \left| \frac{(x-1)^{n+1}}{n+1} \right|, \]
and since 0 < x − 1 ≤ 1, we have that $E_n(x)$ approaches 0 as n → ∞. Therefore, the Taylor series converges to log x for x ∈ [1, 2].

On the other hand, if x < z < 1, then $|f^{(n+1)}(z)| = \frac{n!}{z^{n+1}} < \frac{n!}{x^{n+1}}$, and
\[ |E_n(x)| \le \left| \frac{n!}{x^{n+1}(n+1)!}(x-1)^{n+1} \right| = \frac{1}{n+1}\left(\frac{1}{x} - 1\right)^{n+1}. \]

If $\frac{1}{2} \le x < 1$, then $\left(\frac{1}{x} - 1\right) \le 1$, which means that $E_n(x)$ approaches 0 as n → ∞. Hence, we can conclude that the Taylor series converges to log x for $x \in [\frac{1}{2}, 1)$. Unfortunately, if $0 < x < \frac{1}{2}$, then $\left(\frac{1}{x} - 1\right) > 1$, and thus the bound for $|E_n(x)|$ grows arbitrarily large as n → ∞. So we cannot conclude that the Taylor series converges to log x for $x \in (0, \frac{1}{2})$. However, this argument also does not prove that the series doesn’t converge to log x, because we are only computing a bound for $|E_n(x)|$, and not $E_n(x)$ itself. The fact is that $E_n(x)$ does approach 0 as n → ∞ for all x ∈ (0, 2], and thus the Taylor series does converge to log x, although this is quite difficult to prove for $x \in (0, \frac{1}{2})$ (in particular it requires a different method of error bounding for the degree n Taylor polynomial).


In summary,
\[ \log x = (x-1) - \frac{1}{2}(x-1)^2 + \frac{1}{3}(x-1)^3 - \frac{1}{4}(x-1)^4 + \frac{1}{5}(x-1)^5 - \cdots = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}(x-1)^k \]
for all x ∈ (0, 2]. □
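The gap between the error bound and the actual error for 0 < x < 1/2 can be seen numerically. The sketch below uses the sample point x = 0.3 (an arbitrary choice) and compares the actual error of the degree n Taylor polynomial with the bound $\frac{1}{n+1}\left(\frac{1}{x} - 1\right)^{n+1}$ from the solution; the function names are illustrative.

```python
import math

def log_taylor(x, n):
    """Degree-n Taylor polynomial of log at 1: sum of (-1)^(k+1) (x-1)^k / k."""
    return sum((-1) ** (k + 1) * (x - 1) ** k / k for k in range(1, n + 1))

def derivative_bound(x, n):
    """The bound (1/(n+1)) (1/x - 1)^(n+1), valid for 0 < x < 1."""
    return (1 / x - 1) ** (n + 1) / (n + 1)

x = 0.3
for n in (10, 50, 200):
    actual = abs(math.log(x) - log_taylor(x, n))
    print(n, actual, derivative_bound(x, n))
```

The actual error shrinks toward 0 while the bound explodes, which is consistent with the remark that the bound alone cannot settle convergence on (0, 1/2).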
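To make it look a little nicer, the above equation is more customarily written by shifting the variable by 1:
\[ \log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k. \]

The interval of convergence for this power series is (−1, 1]. If we plug in x = 1, we get the alternating harmonic series, proving that
\[ \log 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]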


One really useful fact about power series is that we can differentiate and integrate them term-by-term. 


Solution for Problem 7.34: We start with
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots \]

When we take the derivative term-by-term, we get:
\[ 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots, \]
which indeed is the Taylor series for cos x. □

It is hardly a coincidence that $\frac{d}{dx}(\sin x) = \cos x$, and that the term-by-term differentiation of the Taylor series of sin x gives the Taylor series of cos x. You can also verify that taking the term-by-term derivative of the Taylor series of $e^x$ (which we know converges to $e^x$) gives back the Taylor series of $e^x$; this is consistent with $\frac{d}{dx} e^x = e^x$.


However, it is quite difficult to prove that this behavior generalizes, so we will have to state the relevant result 
without proof: 
The above result helps us construct new power series easily.


Problem 7.35: Find the Taylor series (at x = 0) of $\tan^{-1} x$.

Solution for Problem 7.35: Rather than computing derivatives of $\tan^{-1}$, there is an easier solution. We know that $\frac{d}{dx} \tan^{-1} x = \frac{1}{1+x^2}$, so we can find the Taylor series of $\frac{1}{1+x^2}$, and antidifferentiate it.

Happily, the power series of $\frac{1}{1+x^2}$ is just a geometric series with common ratio $-x^2$. Specifically:
\[ \frac{1}{1+x^2} = 1 - x^2 + x^4 - x^6 + x^8 - \cdots \]

So we take antiderivatives term-by-term, and we get
\[ \tan^{-1} x = C + x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots \]
Don’t forget about the C! We have to be careful about the constant. When we plug in x = 0, we see that $C = \tan^{-1} 0 = 0$, so the constant is 0. Therefore,
\[ \tan^{-1} x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots \]
To finish, we should determine the values of x for which this power series converges. The geometric series converges for −1 < x < 1, so the Taylor series for $\tan^{-1}$ also converges in this region. But the series for $\tan^{-1}$ also converges at x = 1 and x = −1, since in either case it is an alternating series with decreasing terms that go to 0. So the interval of convergence is [−1, 1]. □


WARNING!! Differentiating or integrating a Taylor series may alter convergence or divergence at the endpoints of the convergence interval.

Note that the convergence of the arctan series at x = 1 gives us a neat formula:
\[ \frac{\pi}{4} = \tan^{-1} 1 = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \]

There is another formula for π that is presented in Section 7.A.
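The arctan formula for π is easy to test, although the series converges slowly. The following sketch sums the first several terms and reports the distance from π; the function name is illustrative.

```python
import math

def arctan_one_partial_sum(n_terms):
    """Sum of the first n_terms of 1 - 1/3 + 1/5 - 1/7 + ..."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n_terms))

for n_terms in (10, 1000, 100000):
    estimate = 4 * arctan_one_partial_sum(n_terms)
    print(n_terms, estimate, abs(math.pi - estimate))
```

Since this is an alternating series with decreasing terms, the error after n terms is at most the size of the next term, so roughly ten times as many terms are needed for each additional correct digit.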


EXERCISES 

7.6.1 Find the Taylor series, and their radii of convergence, for the following functions: 
(a) e* about x = 0. Hints: 80 

(b) sinx* about x = 0. Hints: 24 

(c) xe about x = 0. Hints: 245 

(d) -Vx+1 about x = 2. Hints: 30 


(e) ;4: about x = 0. Hints: 217 
7.6.2 Verify that the term-by-term differentiation of the Taylor series of cos x yields the Taylor series for — sin x. 


7.6.3 Compute the first 4 nonzero terms of the Taylor series of $e^x \sin x$ at x = 0 in two ways:
(a) directly by computing the first five derivatives of $f(x) = e^x \sin x$; and
(b) by multiplying together the Taylor series of $e^x$ and sin x.


7.6.4x Find a function whose Taylor series is 


a 
4 


ape _ 


es 
3-2 5- 


for |x| < 1. Hints: 284, 221 


REVIEW PROBLEMS 
7.36 Define a sequence {an}, by a, = V2 and a, = V2a,-1 for all n > 1. Determine lim a,. Hints: 91 
n—oo 


7.37 For each of the following series, prove that it converges, and if possible, determine what it converges to: 


- 2k-1 = 3k + 4k = (-2)* = K23k 1 
@ )iaeet (b) in © Qiari (d) ies © Dip 


7.38 Find the first four terms of the Taylor series about x = 0 of the following functions: 


(a) e* (b) cosx? (c) et tsinz (d) d 
1 — 4x2 
oo. gn-1 
7.39 restart = A . Hints: 240 


1 
7.40 Estimate if sin x* dx to 3 decimal places using power series. Hints: 124 
0 


7.41 Letp and q be positive real numbers. Compute lim </p" + q”. Hints: 252 


—~ ni 
7.42 Letc>0bea real number. Prove that ye on converges if c > 4 and diverges if c < 4. (Source: HMMT) 
n=1 


7.43 Define a function f via the power series 


s 
Th 


3 co 
f(x) =14+> joa = Yap 


for -1 <x <1, Compute eb /@)4*, (Source: HMMT) Hints: 59 


CHALLENGE PROBLEMS 


7.44 Suppose that the sequence {an}*°, satisfies 0 < dy < don + A2n+i for all n = 1. Prove that Yan diverges. 


n=1 


(Source: Putnam) Hints: 58, 21 


7.45 Define $a_n = \sum_{i=1}^{n} \frac{1}{i} - \log n$.

(a) Use appropriate definite integral(s) to prove that $0 < a_{n+1} < a_n$ for all n ≥ 1. Hints: 2, 159
(b) Use (a) to prove that $\lim_{n\to\infty} a_n$ exists.

Note: $\lim_{n\to\infty} a_n$ is called Euler’s Constant and is usually denoted γ. It is an important number in analysis, but not much is known about it: indeed it is not known whether γ is rational or irrational, although it has been computed to 2 billion decimal places.


7.46 Define a sequence {a,} by 
4 “les ee tS 
"(v= vet vez vn = (n — 1)? : 
Compute lim a,. (Source: HMMT) Hints: 27, 15, 235 
n—oo 


7.47 The Fibonacci numbers are defined as $a_0 = a_1 = 1$ and $a_n = a_{n-1} + a_{n-2}$ for all n ≥ 2. Let
\[ f(x) = \sum_{n=0}^{\infty} a_n x^n = 1 + x + 2x^2 + 3x^3 + 5x^4 + 8x^5 + \cdots. \]


(a) Prove that the radius of convergence of f is at least $\frac{1}{2}$. Hints: 222
(b) Prove that if $|x| < \frac{1}{2}$, then $f(x) = \frac{1}{1 - x - x^2}$. Hints: 264


7.48 The Binomial Theorem works for non-integer exponents too! Specifically, we can write 


\[ (1+x)^p = 1 + \binom{p}{1}x + \binom{p}{2}x^2 + \binom{p}{3}x^3 + \cdots, \]


even when p is not a positive integer. Explain this equation and prove it. What are the necessary conditions (if 
any) on x and p? Hints: 299 


7.49x Let n > 2 be an integer, and for any real number 0 < a@ < 1, let C(a) be the coefficient of x” in the power 
series expansion of (1 + x)*. Prove that 


1 Bon ig ; 
{ [c- - yp all dt = (-1)"n. 


k=1 
(Source: Putnam) Hints: 17, 270 
7.50x Consider the function 


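\[ f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases} \]

(a) Show that, for all positive integers n, there exists a polynomial $p_n$ such that
\[ f^{(n)}(x) = \begin{cases} p_n\!\left(\frac{1}{x}\right) e^{-1/x^2} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases} \]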

Hints: 66, 56 


(b) Using part (a), compute the Taylor series of f at 0, and show that for all nonzero x € R, the Taylor series 
converges to something other than f(x). 
7.A A STRANGE FORMULA FOR π


Problem 7.51: Show that
\[ 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \cdots = \frac{\pi^2}{6}. \]

Sidenote: The argument below was first shown by the mathematician Leonhard Euler in the 18th century.
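Solution for Problem 7.51: We start with the power series for sine:
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots, \]
and we divide by x:
\[ f(x) = \frac{\sin x}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \cdots \]
As we know, this converges for all x ∈ ℝ.
We can make the substitution $y = x^2$ to define
\[ g(y) = 1 - \frac{y}{3!} + \frac{y^2}{5!} - \cdots \]

This still converges for all y ∈ ℝ.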


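We now examine the zeroes of this power series; that is, we find the values of y such that g(y) = 0. The zeroes of f(x) are the nonzero integer multiples of π, so the zeroes of g(y) are the squares of the nonzero integer multiples of π. That is, the zeroes of g(y) are exactly the elements of the set $\{\pi^2, 4\pi^2, 9\pi^2, \ldots\}$.

This means (and here we are taking a lot of liberty and using some manipulations that we have not proved) that we can write
\[ 1 - \frac{y}{3!} + \frac{y^2}{5!} - \cdots = g(y) = \left(1 - \frac{y}{\pi^2}\right)\left(1 - \frac{y}{4\pi^2}\right)\left(1 - \frac{y}{9\pi^2}\right)\cdots. \]

We now examine the linear (or y) terms on both sides of this equation. On the left side, the coefficient is clearly $-\frac{1}{6}$, but on the right side, it is the sum of the y-coefficients of the individual factors, giving
\[ -\left(\frac{1}{\pi^2} + \frac{1}{4\pi^2} + \frac{1}{9\pi^2} + \cdots\right). \]

These must be equal, so we have
\[ -\frac{1}{6} = -\left(\frac{1}{\pi^2} + \frac{1}{4\pi^2} + \frac{1}{9\pi^2} + \cdots\right). \]

Finally, we multiply by $-\pi^2$, and we get
\[ \frac{\pi^2}{6} = 1 + \frac{1}{4} + \frac{1}{9} + \cdots. \]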
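Since the argument above takes liberties, a numerical sanity check is reassuring. The sketch below compares a partial sum of the reciprocals of the squares with $\pi^2/6$, and a truncated version of the infinite product (with $y = x^2$) with $\sin x / x$. The function names, sample point, and truncation sizes are arbitrary choices for illustration.

```python
import math

def reciprocal_squares_partial_sum(n_terms):
    """Sum of 1/k^2 for k = 1, ..., n_terms."""
    return sum(1.0 / (k * k) for k in range(1, n_terms + 1))

def sine_product(x, n_factors):
    """Truncated product of (1 - x^2 / (k pi)^2) for k = 1, ..., n_factors."""
    product = 1.0
    for k in range(1, n_factors + 1):
        product *= 1.0 - x * x / (k * math.pi) ** 2
    return product

print(reciprocal_squares_partial_sum(100000), math.pi ** 2 / 6)
print(sine_product(1.0, 100000), math.sin(1.0) / 1.0)
```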
PLANE CURVES


8.1 PARAMETRIC CURVES 


Let’s go back to our example of the circle of radius 1 centered at the origin of the coordinate plane. We can 
think of the circle as the set of all points (x, y) such that 


\[ x = \cos\theta, \quad y = \sin\theta, \]

where θ ∈ ℝ. If we further restrict θ ∈ [0, 2π), then each value of θ gives a unique point on the circle.
This concept, of describing the x- and y-coordinates of the points of a curve in terms of a new third variable (θ in our example), can be made more general:


Definition: A parametric curve C is a set of points in the coordinate plane satisfying the following: there exists an interval I ⊆ ℝ and real-valued functions u and v, each with domain I, such that
\[ C = \{ (u(t), v(t)) \mid t \in I \}. \]
The functions u and v are called parametric functions, and the pair (u, v) is called a parameterization of C.

Usually I is a closed interval, so that the points of the curve C corresponding to the endpoints of I are the endpoints of the parameterization of C. Note that in the above definition, it is perfectly valid to have I = ℝ.

In our circle example above, we have u(t) = cos t and v(t) = sin t, where t ranges over the interval [0, 2π]. We could have used a different interval for the domain of the parametric functions, such as [0, 2π), or [π, 3π], or even all of ℝ: these are different parameterizations (since the functions have different domains), but they all result in the same parametric curve, since they all produce the set of points consisting of the unit circle.


First, let’s quickly observe that parametric curves really aren’t totally new: 
Problem 8.1: Suppose f is a function such that Dom(f) is an interval. Explain how the graph y = f(x) is a parametric curve.

Solution for Problem 8.1: We just let u(t) = t and v(t) = f(t) for all t ∈ Dom(f). This gives us all points of the form (t, f(t)), which is exactly the graph of y = f(x). □


Problem 8.2: Are parameterizations unique? For example, are there other parameterizations of the circle with 


center (0,0) and radius 1? 


Solution for Problem 8.2: There are infinitely many parameterizations of a given parametric curve. For example, the parametric functions u(t) = cos ct and v(t) = sin ct give the same circle (centered at (0, 0) with radius 1) for any nonzero real number c (assuming we adjust the domain interval of t appropriately). A slightly more exotic parameterization is something like $u(t) = \cos t^2$ and $v(t) = \sin t^2$. In fact, more generally, we have u(t) = cos(f(t)) and v(t) = sin(f(t)) as a parameterization for any function f whose range includes an interval of length 2π. □


So what exactly is the difference between these different parameterizations? The answer comes from how we think about parametric curves. We can visualize a parametric curve as a particle moving along a path, where the variable t represents time. Thinking of this visualization, we can see how the parameterizations

u(t) = cos t, v(t) = sin t

and

u(t) = cos 2t, v(t) = sin 2t

differ: they give the same curve, but in the second parameterization, the particle is moving twice as fast. It only takes π units of time for the particle to make a complete revolution in the second parameterization, versus 2π units of time in the first parameterization. Since, as we have seen, many uses of calculus deal with how quantities are changing over time, it is very useful to have this temporal quality to parameterizations.


Problem 8.3: How does the parameterization

u(t) = cos(−t), v(t) = sin(−t)

describe the motion of a particle on the unit circle?

Solution for Problem 8.3: Note that if x = cos(−t) and y = sin(−t), then for any t we have $x^2 + y^2 = 1$, so this curve is the same circle with center (0, 0) and radius 1 that we have been considering so far. However, with this new parameterization, we think of the particle moving around the circle in the clockwise direction. For example, at t = 0, the particle “starts” at (cos 0, sin 0) = (1, 0), and at $t = \frac{\pi}{2}$, the particle is at the point $(\cos(-\frac{\pi}{2}), \sin(-\frac{\pi}{2})) = (0, -1)$, so the particle has moved one-quarter of the way around the circle in the clockwise direction. □


For simplicity of writing, we often would write the parameterization of the circle from Problem 8.3 as

(cos(−t), sin(−t)).

It is understood that the “coordinates” are actually parametric equations describing the path of the curve.


Problem 8.4: What are parametric equations of the line passing through $(x_0, y_0)$ with slope m?

Solution for Problem 8.4: We can think of the line “starting” at $(x_0, y_0)$, with the particle moving m units in the y-direction for every unit of motion in the x-direction. Thus, possible parametric equations are:
\[ u(t) = x_0 + t, \quad v(t) = y_0 + mt, \]
for t ∈ ℝ. □


One common use of parametric equations is to describe the motion of an object that is a “sum” of two (or 
more) separate motions. A classic example of this is the following: 


Problem 8.5: A wheel of radius 2 feet is moving along a straight path at 1 revolution per second. At time 
t = 0 sec, the point at which the wheel touches the ground is painted yellow. Determine parametric equations 


for the path of the yellow point. 


To show more clearly what’s going on, we can sketch a picture of the motion: 


[Figure: the wheel drawn at successive positions along the path.]


The wheel starts at the position on the left and rotates clockwise. The later positions of the wheel (in 1/8-second 
increments) are shown. The yellow point that we are tracking is the dark dot (a portion of the wheel containing 
the dot is shown in darker shading to add a little contrast). The dashed arc is the path. 


Solution for Problem 8.5: We impose a coordinate system, where we set (0,0) to be where the point starts. The idea 
is to combine the two separate motions that are present: the wheel moving down the path and the point rotating 
around the center of the wheel. 


We can start by writing parametric equations for the center of the wheel. The center moves, in 1 second, a distance equal to the circumference of the circle; thus, the center moves 4π feet per second. Hence, a parametric representation for the center of the wheel is (4πt, 2). (Again, this is really the parametric equations u(t) = 4πt and v(t) = 2, and the curve traced by the center of the wheel is given by (u(t), v(t)).)

Now we determine the position of the dot relative to the center of the circle. The dot is rotating in the clockwise direction, one revolution per second, starting at the point (0, −2). Many answers are possible, but perhaps the simplest is

(−2 sin 2πt, −2 cos 2πt).

Pay close attention to how we combined all the data about the rotation to come up with these functions. The 2πt term gives the proper speed: the dot makes one complete revolution of the circle every second, since sin 2πt and cos 2πt each have period 1. Also, the factor of 2 gives the correct radius, and swapping the sine and cosine terms and multiplying by −1 gives the correct starting point at t = 0 and direction of movement. As a check, plugging in $t = \frac{1}{4}$ gives the point (−2, 0), which is where the point should be relative to the center after a quarter-circle clockwise rotation from the starting point (0, −2).

To get the overall position of the yellow dot, we add together the parameterizations of the motion of the center of the wheel along the road and the motion of the dot around the wheel, giving:

(4πt − 2 sin 2πt, 2 − 2 cos 2πt).

We can sketch the graph of this as t runs from 0 to 3, giving 3 revolutions of the circle:

[Figure: the curve traced for 0 ≤ t ≤ 3.] □
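The parametric equations of Problem 8.5 can be tabulated directly. The sketch below builds the position as the sum of the motion of the center and the rotation about the center; the function name and default arguments are illustrative, with the radius (2 feet) and rate (1 revolution per second) taken from the problem.

```python
import math

def cycloid_point(t, radius=2.0, revs_per_sec=1.0):
    """Position at time t of the marked point on the rolling wheel."""
    angle = 2 * math.pi * revs_per_sec * t
    # Center of the wheel: (radius * angle, radius).
    # Point relative to the center: (-radius sin(angle), -radius cos(angle)).
    x = radius * angle - radius * math.sin(angle)
    y = radius - radius * math.cos(angle)
    return x, y

for t in (0.0, 0.25, 0.5, 1.0):
    print(t, cycloid_point(t))
```

At t = 0.5 the point is at the top of the wheel, at height 4, and at t = 1 it touches the ground again after the wheel has rolled 4π feet.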


The curve from Problem 8.5 is called a cycloid. We also get an interesting picture if the point is in the interior 
of the circle: 


Notice the “corners” are not as sharp. We leave it as an exercise to determine the parametric equations for this 
curve. 


Our next goal is to compute the slope of a tangent line to a parametric curve. Specifically, we want to determine the slope of the tangent line to the parametric curve C, given by parameterization (u(t), v(t)), at the point (u(a), v(a)) for some a.


There are a couple of ways to approach this. One is to think about what we mean by slope: the slope of the tangent line measures
\[ \frac{\text{the rate of change of } y}{\text{the rate of change of } x}. \]

But what is the rate of change of y at (u(a), v(a))? It’s the rate by which the y part of the parameterization is changing, and that’s just v′(a). (Note that when we write this derivative, we think of v as a function of t, so we are computing v′(t) and plugging in the value t = a.) Similarly, the rate of change of x is u′(a). Thus, the slope of the tangent line at (u(a), v(a)) is $\frac{v'(a)}{u'(a)}$, assuming u′(a) ≠ 0.


Another way to determine the slope of the tangent line to a parametric curve is to recall how we determined the slope of the tangent line to the graph y = f(x) of a function f. We start with a secant line of our parametric curve between t = a and t = a + h.

[Figure: secant line through the points (u(a), v(a)) and (u(a + h), v(a + h)).]

We can write an expression for the slope of this secant:
\[ \text{Slope} = \frac{\text{Change in } y}{\text{Change in } x} = \frac{v(a+h) - v(a)}{u(a+h) - u(a)}. \]

To get the slope of the tangent line, we let the secant line approach the tangent line; that is, we let h approach 0:
\[ \lim_{h\to 0} \frac{v(a+h) - v(a)}{u(a+h) - u(a)}. \]

This does not exactly look like a derivative. But we can make it look more recognizable by multiplying and dividing by h:
\[ \lim_{h\to 0} \frac{v(a+h) - v(a)}{h} \cdot \frac{h}{u(a+h) - u(a)}. \]

This is clearly $\frac{v'(a)}{u'(a)}$, as expected.
What happens if u’(a) = 0? Then the x-coordinate is not changing, so if the vertical coordinate is changing (that
is, if v’(a) is nonzero), then we should get a vertical tangent line. This means that the particle is moving vertically
(parallel to the y-axis) along the curve.


Points where u’(a) = v’(a) = 0 may not have well-defined tangent lines. On the other hand, we expect that the
slope of the tangent line to a parametric curve is a continuous function of the parameter; that is, the tangent line
should vary “smoothly” if possible. Thus, at points where u’(a) = v’(a) = 0, we can try to use

\[ \lim_{t \to a} \frac{v'(t)}{u'(t)} \]

as the slope of our tangent line. For example, the curve given by the parameterization (u(t), v(t)) = (t³, t³) is clearly
the line y = x, so the tangent line at (0,0) should have slope 1. We have u’(0) = v’(0) = 0, but

\[ \lim_{t \to 0} \frac{v'(t)}{u'(t)} = \lim_{t \to 0} \frac{3t^2}{3t^2} = 1. \]


\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt}. \]

In other words, the slope dy/dx is the ratio of the derivatives dy/dt and dx/dt of the parametric equations (in terms of t). We can
remember this formula as “canceling the dt’s,” even though this is only a notational device.
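As a quick numerical illustration of the ratio formula, the sketch below estimates dy/dt and dx/dt with central differences and divides them. The helper name `tangent_slope`, the step size `h`, and the choice of the unit circle as a test curve are our own assumptions for illustration, not part of the text.

```python
import math

def tangent_slope(u, v, a, h=1e-6):
    """Approximate dy/dx at t = a as (dy/dt)/(dx/dt), using central differences."""
    du = (u(a + h) - u(a - h)) / (2 * h)  # estimate of u'(a)
    dv = (v(a + h) - v(a - h)) / (2 * h)  # estimate of v'(a)
    return dv / du  # only meaningful when u'(a) is not zero

# Test curve (an assumption for illustration): the unit circle (cos t, sin t).
# Its exact slope is cos(t) / (-sin(t)).
a = 1.0
approx = tangent_slope(math.cos, math.sin, a)
exact = math.cos(a) / (-math.sin(a))
print(approx, exact)
```

The two printed values should agree to many decimal places. The sketch is not reliable near points where u’(a) = 0, which need the separate treatment described above.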


Let’s go back to our cycloid: 


Solution for Problem 8.6: Taking the derivatives of the parametric equations gives us

\[ \frac{dx}{dt} = 4\pi - 4\pi \cos 2\pi t, \qquad \frac{dy}{dt} = 4\pi \sin 2\pi t. \]

So the slope of a tangent line is

\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{4\pi \sin 2\pi t}{4\pi - 4\pi \cos 2\pi t} = \frac{\sin 2\pi t}{1 - \cos 2\pi t}. \]

Note that if t is an integer, this quantity is 0/0. These are the “sharp corners” of the cycloid. At these points, we can
consider the limit of this quantity, and attempt to use l’Hôpital’s Rule. Specifically, for any integer n,

\[ \lim_{t \to n} \frac{\sin 2\pi t}{1 - \cos 2\pi t} = \lim_{t \to n} \frac{2\pi \cos 2\pi t}{2\pi \sin 2\pi t} = \lim_{t \to n} \cot 2\pi t, \]

which is undefined, so l’Hôpital’s Rule does not apply. In fact, the limit is undefined. □


Sidenote: At the sharp corner points of the cycloid, we notice that

\[ \lim_{t \to n^+} \frac{\sin 2\pi t}{1 - \cos 2\pi t} = +\infty \quad \text{and} \quad \lim_{t \to n^-} \frac{\sin 2\pi t}{1 - \cos 2\pi t} = -\infty. \]

Since +∞ and −∞ could each be the “slope” of a vertical line, it is OK for
us to say that the cycloid has vertical tangent lines at these points. This is very
similar to our definition of vertical asymptote, in which we only required a limit
of ±∞ from either direction, and not the same limit from both directions at once.
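We can see this one-sided behavior numerically. The sketch below evaluates the cycloid's slope expression just to the right and just to the left of t = 1. The function name `cycloid_slope` and the sample offsets are our own choices for illustration.

```python
import math

def cycloid_slope(t):
    """dy/dx for the cycloid (4*pi*t - 2*sin(2*pi*t), 2 - 2*cos(2*pi*t))."""
    return math.sin(2 * math.pi * t) / (1 - math.cos(2 * math.pi * t))

# Approach the corner at t = 1 from the right and from the left.
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, cycloid_slope(1 + eps), cycloid_slope(1 - eps))
```

As the offset shrinks, the right-hand values should grow large and positive while the left-hand values grow large and negative, which matches a vertical tangent line at the corner.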


The slope of the tangent line gives us the direction of movement along the curve. Related to the direction of 
movement is the speed of movement: 


Solution for Problem 8.7: If the particle were moving horizontally as a function x(t) of t, then its speed would just be
|dx/dt|, the absolute value of the derivative x’(t). (Note that speed is always nonnegative, so we take the absolute
value.) Similarly, if the particle were moving vertically as a function y(t) of t, its speed would be |dy/dt|.

But our particle is moving in an arbitrary direction. So to calculate its speed, we break up the particle’s motion into
its x- and y-components, and apply the Pythagorean Theorem. That is, the particle is moving at a speed of
|dx/dt| = |u’(t)| in the x-direction, and a speed of |dy/dt| = |v’(t)| in the y-direction. Thus, to compute the speed
of the particle in the direction of the curve, we form a right triangle with sides of length |u’(t)| and |v’(t)|, and
compute the length of the hypotenuse of this triangle.

Therefore, the speed of the particle at t = a on the curve parameterized by (u(t), v(t)) is

\[ \sqrt{(u'(a))^2 + (v'(a))^2}. \]

Note that at points where u’(a) = v’(a) = 0, the particle comes to a stop (that is, has speed 0).


Problem 8.8: Once again, recall that the cycloid from Problem 8.5 was given parametrically by 


\[ (4\pi t - 2\sin 2\pi t,\; 2 - 2\cos 2\pi t). \]


At any time t, what is the speed of the point whose path forms the cycloid? 


Solution for Problem 8.8: The derivatives of the parametric equations are:

\[ \frac{dx}{dt} = 4\pi - 4\pi \cos 2\pi t, \qquad \frac{dy}{dt} = 4\pi \sin 2\pi t. \]

So the speed is

\[ \sqrt{(4\pi - 4\pi \cos 2\pi t)^2 + (4\pi \sin 2\pi t)^2}. \]

This simplifies to

\[ 4\pi \sqrt{(1 - \cos 2\pi t)^2 + \sin^2 2\pi t}. \]

This equals

\[ 4\pi \sqrt{1 - 2\cos 2\pi t + \cos^2 2\pi t + \sin^2 2\pi t} = 4\pi \sqrt{2 - 2\cos 2\pi t}. \]

Finally, we can rewrite this as

\[ 8\pi \sqrt{\frac{1 - \cos 2\pi t}{2}} = 8\pi |\sin \pi t|. \]

Again, when t is an integer, the speed is 0, and the point comes to a complete (instantaneous) stop. □
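As a sanity check on the simplification, the sketch below compares the speed computed directly from the two derivatives with the closed form 8π|sin πt| at a few sample times. The function names and the sample times are our own choices for illustration.

```python
import math

def cycloid_speed(t):
    """Speed computed directly from dx/dt and dy/dt of the cycloid."""
    dx = 4 * math.pi - 4 * math.pi * math.cos(2 * math.pi * t)
    dy = 4 * math.pi * math.sin(2 * math.pi * t)
    return math.hypot(dx, dy)

def closed_form(t):
    """The simplified expression for the speed."""
    return 8 * math.pi * abs(math.sin(math.pi * t))

samples = [0.0, 0.1, 0.25, 0.5, 0.9, 1.0, 1.7]
max_gap = max(abs(cycloid_speed(t) - closed_form(t)) for t in samples)
print(max_gap)
```

The largest gap should be at the level of floating-point rounding, and both expressions are essentially 0 at the integer times in the sample list.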
We can use speed to get the length of a parametric curve in the same way that we do for motion in 1 dimension: 
we integrate speed to get distance. 


Definition: The length of the parametric curve (u(t), v(t)) from t = a to t = b is

\[ \int_a^b \sqrt{(u'(t))^2 + (v'(t))^2}\, dt. \]


One interesting thing to note is that while the speed depends on the choice of parameterization, the length 
does not, provided that the new parameterization “moves in the same direction” as the original parameterization. 
(You can explore the details of this statement and try to prove it as a Challenge Problem.) This should not be a 
surprise, as the length of a curve should not depend on how fast a particle is moving along the curve—the length 
should be an inherent property of the curve. 
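To make the independence claim concrete, here is a small sketch that approximates the length integral with a midpoint rule for two different parameterizations of the unit circle. The helper `curve_length`, the subdivision count, and the second parameterization (cos t², sin t²) are our own assumptions for illustration.

```python
import math

def curve_length(du, dv, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of sqrt(u'(t)^2 + v'(t)^2) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += math.hypot(du(t), dv(t))
    return total * h

# Unit circle traced at constant speed: (cos t, sin t) for t in [0, 2*pi].
length_1 = curve_length(lambda t: -math.sin(t), lambda t: math.cos(t), 0, 2 * math.pi)

# The same circle traced with increasing speed: (cos t^2, sin t^2) for t in [0, sqrt(2*pi)].
length_2 = curve_length(
    lambda t: -2 * t * math.sin(t * t),
    lambda t: 2 * t * math.cos(t * t),
    0,
    math.sqrt(2 * math.pi),
)
print(length_1, length_2, 2 * math.pi)
```

The speeds differ (1 in the first case, 2t in the second), but both approximations should agree with 2π up to rounding.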


We can check that our new definition of length matches our prior intuitive notion of length for a particularly 
important example: 


Problem 8.9: Verify the formula for the circumference of a circle. 


Solution for Problem 8.9: For simplicity, we can assume that the circle has radius r > 0 and center (0,0). So a simple
parameterization is (r cos t, r sin t) for t ∈ [0, 2π]. Thus, to compute the circumference, we integrate

\[ \int_0^{2\pi} \sqrt{r^2 \sin^2 t + r^2 \cos^2 t}\, dt. \]

This is just

\[ \int_0^{2\pi} r\, dt = 2\pi r. \]

□


To finish this section, we'll look at a more open-ended problem, where we're given a parametric curve and we 
want to find out as much as we can about it. 


Problem 8.10: Explore the astroid given by the parameterization (cos³ t, sin³ t). Sketch its graph, determine
the slopes of its tangent lines, and compute its length.


Solution for Problem 8.10: The first thing we notice is that the parameterizing functions are periodic with period
2π. So taking t ∈ [0, 2π] will give us the entire curve. When t is a multiple of π/2, we get the same points as the
circle of radius 1 centered at (0,0), so the astroid passes through (1,0), (0,1), (−1,0), and (0,−1).

We can compute the slopes of the tangent lines by first computing the derivatives of the parameterizing
equations:

\[ \frac{dx}{dt} = -3\cos^2 t \sin t, \qquad \frac{dy}{dt} = 3\sin^2 t \cos t. \]

So the slope of a tangent line is

\[ \frac{dy}{dx} = \frac{3\sin^2 t \cos t}{-3\cos^2 t \sin t} = -\frac{\sin t}{\cos t} = -\tan t. \]


However, we have to be a little bit careful: since we canceled out sin t cos t in our above calculation, we actually
have points where dx/dt = dy/dt = 0 whenever sin t = 0 or cos t = 0. These are the four points of the astroid that lie on
the x-axis or y-axis, and they may appear as “sharp corners” of our curve.


Also, since the slopes of the tangent lines to the circle given parametrically
by (cos t, sin t) are

\[ \frac{\cos t}{-\sin t} = -\cot t, \]

we see that the astroid at time t has the inverse slope of the circle at time
t. Thus, the astroid should look something like an “inverted circle,” as
shown at right.


We can also compute its length:

\[ \int_0^{2\pi} \sqrt{9\cos^4 t \sin^2 t + 9\sin^4 t \cos^2 t}\, dt. \]

We can pull out the common factors, but we’re careful not to forget the
absolute value when doing so!

\[ \int_0^{2\pi} 3|\cos t \sin t| \sqrt{\cos^2 t + \sin^2 t}\, dt. \]

We know that cos² t + sin² t = 1, so we are left with:

\[ \int_0^{2\pi} 3|\cos t \sin t|\, dt. \]

The absolute value sign makes things slightly delicate. We’re perhaps safest using the trig identity cos t sin t =
½ sin 2t, giving:

\[ \frac{3}{2} \int_0^{2\pi} |\sin 2t|\, dt. \]

The easiest way to finish is to note that this is 4 times the value of the same integral from t = 0 to t = π/2. On [0, π/2],
we have sin 2t ≥ 0, so we can remove the absolute value and we have:

\[ 6 \int_0^{\pi/2} \sin 2t\, dt = -3\cos 2t \,\Big|_0^{\pi/2} = 6. \]


Thus the length of the astroid is 6. It is perhaps a surprise that the length is rational. □
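We can cross-check this length numerically by integrating the speed of the astroid directly, without using any of the simplifications above. This is only a sketch: the midpoint rule and the subdivision count are our own choices, and the result is an approximation.

```python
import math

def astroid_speed(t):
    """Speed along (cos^3 t, sin^3 t), computed from the two derivatives."""
    dx = -3 * math.cos(t) ** 2 * math.sin(t)
    dy = 3 * math.sin(t) ** 2 * math.cos(t)
    return math.hypot(dx, dy)

n = 100_000  # a multiple of 4, so the four cusps fall on subinterval endpoints
h = 2 * math.pi / n
astroid_length = h * sum(astroid_speed((i + 0.5) * h) for i in range(n))
print(astroid_length)
```

The printed value should be very close to 6.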


What’s amazing (but not too hard to prove) is that the astroid is an example of a hypocycloid, which is like 
our earlier example of a cycloid, but instead traces the path of a point on a circle rotating inside another larger 
circle. As an exercise, you will prove that a particular hypocycloid is the astroid that we just explored. 


EXERCISES 


8.1.1 Write parametric equations for the following curves: 
(a) A circle of radius 2 centered at (1,1).


(b) A spiral centered at (0,0) where the radius increases at a constant rate of 4 units per revolution, so that the
spiral passes through (in order): (0,0), (0,1), (−2,0), (0,−3), (4,0), (0,5), (−6,0), ....


(c)* An ellipse with major axis of length 4 parallel to the x-axis, minor axis of length 2 parallel to the y-axis, and
centered at (2,−3). Hints: 16


8.1.2 Find parametric equations for the cycloid traced by a point inside of a wheel of radius 2, where the point
starts (at t = 0) halfway between the center of the wheel and the ground, and the wheel moves to the right at 1
revolution per second. Also determine the speed (in terms of t) that the point is traveling as it traces the path of
the cycloid.


8.1.3 Consider the curve given by the parameterization (cos t,t + sin t) for t € [0,77]. 


(a) Find the slope of the tangent line to the curve at t = 4.


(b) Find the length of the curve. 


8.1.4* Find parametric equations for the hypocycloid that is produced when we track a point on a circle of radius
1/4 that rotates inside a circle of radius 1. Show that this curve is the astroid from Problem 8.10. Hints: 3, 131, 248


8.1.5* Sketch the curve given by the parameterization (cos 3t, sin 5t). This curve is known as a Lissajous curve.
Hints: 183, 75


Sidenote: One more interesting formula, that we won't prove in this text, is a version of what 
is known as Green’s Theorem in vector calculus. 


Suppose that (u(t), v(t)) parameterizes a closed simple smooth curve from t = a
to t = b in the counterclockwise direction. (“Closed simple smooth” essentially
means that the curve starts and ends at the same point, doesn’t intersect itself
except at its endpoints, and doesn’t have any sharp corners.) Then the area it
encloses is

\[ \frac{1}{2} \int_a^b \bigl(u(t)v'(t) - u'(t)v(t)\bigr)\, dt. \]

We can see an example of Green’s Theorem with the circle of radius r: it has
parameterization (r cos t, r sin t), so Green’s Theorem tells us the area is

\[ \frac{1}{2} \int_0^{2\pi} \bigl((r\cos t)(r\cos t) - (-r\sin t)(r\sin t)\bigr)\, dt. \]

The integrand simplifies to just r², so the area is πr², as expected.
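The same area formula can be tried numerically on another closed curve. The sketch below applies it to an ellipse with semi-axes 2 and 1, whose area is known to be 2π. The helper `enclosed_area`, the midpoint rule, and the choice of ellipse are our own assumptions for illustration.

```python
import math

def enclosed_area(u, v, du, dv, a, b, n=10_000):
    """Midpoint-rule approximation of (1/2) * integral of (u*v' - u'*v) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += u(t) * dv(t) - du(t) * v(t)
    return 0.5 * total * h

# Ellipse (2 cos t, sin t), traced counterclockwise for t in [0, 2*pi].
ellipse_area = enclosed_area(
    lambda t: 2 * math.cos(t), math.sin,      # u(t), v(t)
    lambda t: -2 * math.sin(t), math.cos,     # u'(t), v'(t)
    0, 2 * math.pi,
)
print(ellipse_area, 2 * math.pi)
```

Tracing the curve clockwise instead would flip the sign of the result, which is why the counterclockwise orientation matters in the statement above.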


8.2 POLAR COORDINATES 


You have likely seen polar coordinates before. Polar coordinates provide an alternative system for describing 
points in the plane, in terms of their positions relative to the origin. 
Definition: A point P in the coordinate plane is represented in polar coordinates
as (r, θ), where r is the distance from P to the origin, and θ is the angle between
the positive x-axis and the ray OP, where O is the origin, measured in the usual
counterclockwise direction. The number r is called the magnitude of P and the
angle θ is called the argument of P.


The first task is converting between our usual “rectangular” coordinates and polar coordinates. 


Problem 8.11:
(a) If P is given in rectangular coordinates by (x, y), what are the polar coordinates (r, θ) in terms of x and y?


(b) If P is given in polar coordinates by (r, θ), what are the rectangular coordinates (x, y) in terms of r and θ?


Solution for Problem 8.11:

(a) The picture accompanying the definition of polar coordinates makes it pretty clear what
we need to do. We just use basic facts about right triangles. We have r² = x² + y², so
r = √(x² + y²). We also have tan θ = y/x, so θ = tan⁻¹(y/x). However, this only works for
points in the first quadrant, as in the diagram to the right. What happens if we have
a point in the second quadrant, creating an obtuse angle? In this case, we still have
r = √(x² + y²), but we don’t have θ = tan⁻¹(y/x), since the range of tan⁻¹ is (−π/2, π/2) and
we have θ ∈ (π/2, π) in the second quadrant.

The formula θ = tan⁻¹(y/x) only gives θ up to a constant ±π, since tan⁻¹ only has the range (−π/2, π/2). For
example, the point (−1,−1) in polar coordinates is given by r = √2 and θ = 5π/4, but 5π/4 ≠ tan⁻¹(1). We need
to use our geometric knowledge of where the point lies to determine whether we need to add or subtract
π. Another problem is if x = 0. Then y/x is undefined, so we cannot compute tan⁻¹(y/x). However, we
know that we must have θ = π/2 or 3π/2, depending on whether y is positive or negative.

(b) The other direction, going from polar to rectangular, is much easier:

\[ x = r\cos\theta, \qquad y = r\sin\theta, \]

and there are no exceptions to this rule.

□
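The quadrant bookkeeping in part (a) is exactly what the two-argument arctangent handles. The sketch below uses Python's `math.atan2`, which takes y and x separately and so can tell the quadrants apart and cope with x = 0. The function names `to_polar` and `to_rectangular`, and the convention of returning θ in [0, 2π) with r ≥ 0, are our own choices for illustration.

```python
import math

def to_polar(x, y):
    """Return (r, theta) with r >= 0 and theta normalized into [0, 2*pi)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)  # atan2 returns an angle in [-pi, pi]
    return r, theta

def to_rectangular(r, theta):
    """Return (x, y) = (r cos theta, r sin theta); valid for any r and theta."""
    return r * math.cos(theta), r * math.sin(theta)

# The third-quadrant example from the solution: (-1, -1).
r, theta = to_polar(-1, -1)
print(r, theta)                  # compare with sqrt(2) and 5*pi/4
print(to_rectangular(r, theta))  # should recover approximately (-1, -1)
```

Note that this returns only one of the infinitely many valid polar representations of a point.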


It will be convenient for us to allow for polar coordinates with negative r. We think of this as a distance
“in the negative direction” of the ray that forms angle θ with the positive x-axis. This also means that, in polar
coordinates, (r, θ) = (−r, θ + π) for any r and θ.


One perhaps unfortunate feature of polar coordinates is that any point has infinitely many different representations.
For example, the point (x, y) = (1,0) in rectangular coordinates can be represented in polar coordinates as
any of the following:


(1, 0), (1, 2π), (1, 4π), (1, −2π), (1, −4π), (−1, π), (−1, 3π), (−1, −π), (−1, −3π), ....
Our main use of polar coordinates will be to describe the graph of curves of the form r = f(θ), as in the
following example:


Solution for Problem 8.12: We can initially try to study this graph by plugging in a few common values of θ and
determining what the corresponding points are on the graph.

For example:

• Setting θ = 0 gives r = a sin 0 = 0, so the (rectangular) point (0,0) lies on the graph;
• Setting θ = π/2 gives r = a sin(π/2) = a, so the (rectangular) point (0,a) lies on the graph;
• Setting θ = π gives r = a sin π = 0, so the (rectangular) point (0,0) lies on the graph (again);
• Setting θ = 3π/2 gives r = a sin(3π/2) = −a, so the (rectangular) point (0,a) lies on the graph (again).

This is not much information.

Let’s look a little more closely at the shape of this graph by considering what happens as θ varies from 0 to
2π. As θ increases from 0 to π/2, the radius r increases from 0 to a. Then as θ increases from π/2 to π, the radius
decreases from a back to 0. We may also note that as θ ranges from π to 2π, we get exactly the same graph, since
a sin(θ + π) = −a sin θ.

To determine exactly what the graph looks like, we can convert to rectangular coordinates. Noting that
y = r sin θ, we have y/r = sin θ, so our equation is r = a sin θ = ay/r, or r² = ay. (We need to be a little careful, as
this doesn’t work when r = 0, but that point is just the origin, which we know is on the graph.) But we also have
r² = x² + y², so we get the equation

\[ x^2 + y^2 = ay. \]

To examine the graph of this, we complete the square:

\[ x^2 + \left(y - \frac{a}{2}\right)^2 = \left(\frac{a}{2}\right)^2. \]

So the graph is a circle with center (0, a/2) and radius a/2.

The rays in the picture above indicate different values of θ, and are there to illustrate how r increases as θ increases
from 0 to π/2, and then decreases as θ increases from π/2 to π. □
Solution for Problem 8.13: As in the previous problem, we can analyze what happens to r as we vary θ:

• As θ ranges from 0 to π/3, the radius increases from 0 to 2 and then decreases back to 0.
• As θ ranges from π/3 to 2π/3, the radius goes from 0 to −2 and then back to 0.
• As θ ranges from 2π/3 to π, the radius goes from 0 to 2 and then back to 0.
• And so on....

Sketching this gives us a 3-leafed rose, as shown to the right. Note that each
range of θ between multiples of π/3 gives a leaf of the rose, but that the leaves
overlap in pairs: for example, the leaf for θ ∈ [0, π/3] overlaps with the leaf for
θ ∈ [π, 4π/3].

To get a rectangular equation, we can use the triple-angle formula:

\[ \sin 3\theta = 3\sin\theta - 4\sin^3\theta. \]

So we have

\[ r = 2\sin 3\theta = 6\sin\theta - 8\sin^3\theta = \frac{6y}{r} - \frac{8y^3}{r^3}. \]

We multiply by r and substitute r² = x² + y² to get:

\[ x^2 + y^2 = 6y - \frac{8y^3}{x^2 + y^2}. \]

To clean this up, we can clear the denominator:

\[ (x^2 + y^2)^2 + 2y^3 - 6x^2 y = 0. \]

□


More generally, the graph of r = a sin nθ, where n is a positive integer, is called a rose. As we saw in Problem
8.12, a circle is a special case of a rose. Roses are good examples of curves that are ideal for polar coordinates: the
rectangular equations for these curves are usually somewhat complicated, but the polar equations are simple and,
more importantly, give us an idea of the nature of the curves.


When exploring these sorts of graphs, you have to be a little careful about missing “hidden” points. For 


example: 


Problem 8.14: Is the polar point (1, π/2) on the graph of r = cos 2θ?


Solution for Problem 8.14: You might quickly say “no”:


Bogus Solution: Plugging in θ = π/2 gives r = cos π = −1, so the point (−1, π/2) is on the
graph, not (1, π/2).

However, (1, π/2) is on the graph! Remember that any point has infinitely many different polar representations.
Since θ = 3π/2 gives r = −1, the point (−1, 3π/2) is on the graph. But this is the same point as (1, π/2), so (1, π/2) is on the
graph. □


Concept: Points can be represented in polar coordinates in infinitely many ways.
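One way to avoid missing such “hidden” points is to test several representations of the same point, not just the one we were handed. The sketch below does this for a finite range of representations. The function name `on_polar_graph`, the range of k values, and the tolerance are our own assumptions, and a finite search like this is only a heuristic for a general f.

```python
import math

def on_polar_graph(f, r, theta, k_range=range(-2, 3), tol=1e-9):
    """Check whether the polar point (r, theta) lies on r = f(theta),
    trying the representations (r, theta + 2k*pi) and (-r, theta + (2k+1)*pi)."""
    for k in k_range:
        same = theta + 2 * k * math.pi           # same r, angle shifted by full turns
        flipped = theta + (2 * k + 1) * math.pi  # negated r, angle shifted by odd multiples of pi
        if abs(f(same) - r) < tol or abs(f(flipped) + r) < tol:
            return True
    return False

f = lambda theta: math.cos(2 * theta)
print(on_polar_graph(f, 1, math.pi / 2))  # the point from Problem 8.14
```

For the point (1, π/2), the direct substitution fails but a flipped representation succeeds, so the check reports that the point is on the graph.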


The polar curve r = f(θ) can be viewed as a special case of a parametric curve. Since x = r cos θ and y = r sin θ,
our curve is parameterized by

\[ (r\cos\theta,\; r\sin\theta) = (f(\theta)\cos\theta,\; f(\theta)\sin\theta), \]

for the parameter θ. This allows us to apply our parametric curve tools from Section 8.1 to study polar curves.
For instance:


Problem 8.15: Let r = f(θ) be the equation of a curve in polar coordinates. What is the slope of the tangent
line to this curve at any given value of θ?


Solution for Problem 8.15: Treating our polar curve as a parametric curve, we can use the usual idea of

\[ \frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta}. \]

Thus, we evaluate dx/dθ and dy/dθ in terms of r and θ.

We start with

\[ y = r\sin\theta = f(\theta)\sin\theta. \]

Then we use the product rule:

\[ \frac{dy}{d\theta} = f(\theta)(\cos\theta) + f'(\theta)(\sin\theta). \]

Since r = f(θ), we will write r’ = f’(θ), and thus

\[ \frac{dy}{d\theta} = r\cos\theta + r'\sin\theta. \]

Similarly,

\[ x = r\cos\theta = f(\theta)\cos\theta, \]

so upon differentiating, we get

\[ \frac{dx}{d\theta} = f(\theta)(-\sin\theta) + f'(\theta)(\cos\theta). \]

Thus

\[ \frac{dx}{d\theta} = -r\sin\theta + r'\cos\theta. \]

Putting this all together, we have

\[ \text{Slope of tangent at } \theta = \frac{r\cos\theta + r'\sin\theta}{-r\sin\theta + r'\cos\theta}. \]

□


There’s really no point in memorizing this formula. It doesn’t come up that often, and it’s easy to rederive
when you need it. The key thing to remember is:

\[ \frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta}. \]


Problem 8.16: Find the slope of the tangent line to r = 3 cos 2θ at θ = π/3.
Solution for Problem 8.16: First, we should get a rough idea of what the graph
looks like. The cos 2θ term has an interval of π/2 between zeroes, so the graph
is a 4-leaf rose (the leaves do not overlap), as shown to the right. We have also
drawn the ray corresponding to θ = π/3. At this point, you might jump to the
wrong conclusion:


However, it’s not that point at all! We plug in θ = π/3 and we get r = 3 cos(2π/3) =
−3/2. The resulting point is in the 3rd quadrant, as shown below:


In rectangular coordinates, it’s the point (−3/4, −3√3/4).


Now, to compute the slope of the tangent line, we just use the formula from Problem 8.15:

\[ \frac{dy}{dx} = \frac{3\cos 2\theta \cos\theta - 6\sin 2\theta \sin\theta}{-3\cos 2\theta \sin\theta - 6\sin 2\theta \cos\theta}. \]

We then plug in θ = π/3:

\[ \frac{3\cos\frac{2\pi}{3}\cos\frac{\pi}{3} - 6\sin\frac{2\pi}{3}\sin\frac{\pi}{3}}{-3\cos\frac{2\pi}{3}\sin\frac{\pi}{3} - 6\sin\frac{2\pi}{3}\cos\frac{\pi}{3}} = \frac{-21/4}{-3\sqrt{3}/4} = \frac{7}{\sqrt{3}}. \]

So the slope is 7/√3, and the tangent line is shown in the picture below:
□
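Since the polar slope formula is easy to mistype, a short numerical check can help. The sketch below codes the formula from Problem 8.15 and evaluates it for r = 3 cos 2θ at θ = π/3. The function name `polar_slope` and its interface (passing f and f’ as separate functions) are our own choices for illustration.

```python
import math

def polar_slope(f, df, theta):
    """Slope dy/dx of r = f(theta), computed as (dy/dtheta) / (dx/dtheta)."""
    r, dr = f(theta), df(theta)
    dy = r * math.cos(theta) + dr * math.sin(theta)
    dx = -r * math.sin(theta) + dr * math.cos(theta)
    return dy / dx  # undefined where dx/dtheta = 0

slope = polar_slope(
    lambda t: 3 * math.cos(2 * t),    # r = 3 cos 2*theta
    lambda t: -6 * math.sin(2 * t),   # r' = -6 sin 2*theta
    math.pi / 3,
)
print(slope, 7 / math.sqrt(3))
```

The two printed numbers should agree, both a little over 4, which is consistent with a steeply rising tangent line at the third-quadrant point found above.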


As we’ve seen, finding tangent lines to polar curves is a bit messy. This is not a big surprise, since lines 
themselves are not all that nice in polar coordinates, as we will see in the next problem. 


Solution for Problem 8.17: There’s really no easy way to do this except to do it first in rectangular coordinates and
then convert to polar coordinates. The rectangular equation for the line is

\[ y - y_0 = m(x - x_0). \]

We convert this to polar coordinates:

\[ r\sin\theta - y_0 = m(r\cos\theta - x_0). \]

To write this in the usual form of a polar curve, we solve for r:

\[ r(\sin\theta - m\cos\theta) = y_0 - mx_0, \]

giving

\[ r = \frac{y_0 - mx_0}{\sin\theta - m\cos\theta}. \]


There are other ways to describe lines in polar coordinates; some of them will be explored further in the 
exercises. 


EXERCISES 


8.2.1 For each of the following equations in polar coordinates, describe the graph, and find the slope of the 
tangent line at the given point. 


(a) r = a cos θ + b sin θ, where a and b are positive real numbers, with the slope of the tangent line at the point
where θ = 0. Hints: 128


(b) r = a + b sin θ, where a and b are positive real numbers, with the slope of the tangent line at the point where
θ = $. (These curves are called limaçons.) Hints: 285


(c)* r = 1 − sin 2θ, with the slope of the tangent line at the origin. Hints: 28, 19


8.2.2 Find the equation in polar coordinates of the line with slope m that passes through the polar point (r₀, θ₀).
Hints: 246
8.3 AREAS IN POLAR COORDINATES


Suppose r = f(θ) is a curve given in polar coordinates. We would
like to compute the area of the “sector” between the curve and the rays
given by the starting and ending angles θ = a and θ = b, as in the picture
to the right. To compute this area, we can set up an appropriate Riemann
sum of smaller regions whose areas we know how to compute. When
we compute the area under a rectangular curve given by y = f(x), we
divide the region into rectangles. However, rectangles are not a natural
object to work with in polar coordinates, so instead we divide our region
into a number of small circular sectors:


Those regions may look like triangles, but they actually have circular arcs at the far end, so that they are sectors 
of a circle. Fortunately, areas of sectors are easy to compute: 


Solution for Problem 8.18: A circle of radius r has area πr², and a sector of angle Δθ covers a proportion of Δθ/(2π) of
the entire circle. So the area of the sector is ½r²(Δθ). (Note that if Δθ = 2π, then we get the whole circle of area
πr².) □


Given that we can compute the areas of the sectors, we can sum up the sectors to get an approximation of the
area of the region:

\[ \sum_i \frac{1}{2} (r_i)^2 (\Delta\theta_i), \]

where rᵢ is the radius of the i-th sector, and Δθᵢ is the measure of the angle of that sector. This is a Riemann sum, so
when we take the limit of this as the number of sectors grows, we get a definite integral.


Summing up, we see the following: 


Important: The area of the region between r = f(θ) and the rays θ = a and θ = b is

\[ \frac{1}{2} \int_a^b r^2\, d\theta = \frac{1}{2} \int_a^b (f(\theta))^2\, d\theta. \]
WARNING! The most common mistake is to use simply ∫ f(θ) dθ for the area. To avoid
this error, remember that the area of a circle of radius r is πr². In polar coordinates,
this circle has constant radius r, and we can compute its area via the integral

\[ \frac{1}{2} \int_0^{2\pi} r^2\, d\theta = \frac{1}{2} r^2 (2\pi) = \pi r^2. \]


A typical example is the following: 
Problem 8.19: Find the area of one leaf of the rose given by r = 3 sin 2θ.


Solution for Problem 8.19: As we’ve seen before, this is a 4-leaf rose. One leaf of
the rose ranges from θ = 0 to θ = π/2. Thus the definite integral for the area is

\[ \frac{1}{2} \int_0^{\pi/2} 9\sin^2 2\theta\, d\theta = \frac{9}{2} \int_0^{\pi/2} \sin^2 2\theta\, d\theta. \]

Now use the substitution u = 2θ to get

\[ \frac{9}{4} \int_0^{\pi} \sin^2 u\, du. \]

There are lots of ways to finish from here. For instance, we can now use the
trig identity sin² u = ½(1 − cos 2u) to get

\[ \frac{9}{8} \int_0^{\pi} (1 - \cos 2u)\, du. \]

Note that the latter term integrates to 0 since the integral covers an entire period of cos 2u. Therefore, the area is
just 9π/8. □
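The polar area formula is simple to approximate numerically, which gives a quick check on answers like this one. The sketch below uses a midpoint rule. The helper name `polar_area` and the subdivision count are our own assumptions for illustration.

```python
import math

def polar_area(f, a, b, n=10_000):
    """Midpoint-rule approximation of (1/2) * integral of f(theta)^2 over [a, b]."""
    h = (b - a) / n
    return 0.5 * h * sum(f(a + (i + 0.5) * h) ** 2 for i in range(n))

# One leaf of the rose r = 3 sin 2*theta, for theta from 0 to pi/2.
leaf_area = polar_area(lambda theta: 3 * math.sin(2 * theta), 0, math.pi / 2)
print(leaf_area, 9 * math.pi / 8)
```

Note the square and the factor of one half inside `polar_area`; leaving either out reproduces the common mistake described in the warning above.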
A slightly more complicated example is the following: 


Problem 8.20: Find the area of the region that is the intersection of the interiors of the graphs of r = 2(1 + cos θ)
and r = 2(1 − cos θ).


These curves are called cardioids.
Solution for Problem 8.20: We can sketch a picture, shown at right. The curve
on the right is r = 2(1 + cos θ) and the curve on the left is r = 2(1 − cos θ).
These curves intersect when

\[ 2(1 + \cos\theta) = 2(1 - \cos\theta), \]

so cos θ = 0, which corresponds to points on the y-axis.

One way to compute the area is to note that the graphs are reflections of
each other across the y-axis, so the total area is just twice the area inside r = 2(1 + cos θ) between θ = π/2 and θ = 3π/2,
as shown below:

Even easier than that, because of the symmetry of the graph across the x-axis, the original area is 4 times the area
inside r = 2(1 + cos θ) between θ = π/2 and θ = π (which is the top half of the shaded region in the diagram above).
Thus, the original area is given by the definite integral

\[ 4 \cdot \frac{1}{2} \int_{\pi/2}^{\pi} r^2\, d\theta = 2 \int_{\pi/2}^{\pi} (2(1 + \cos\theta))^2\, d\theta. \]

Expanding the square in the integrand gives

\[ 8 \int_{\pi/2}^{\pi} (1 + 2\cos\theta + \cos^2\theta)\, d\theta. \]

Now we use the fact that cos² θ = ½(1 + cos 2θ). So our integral becomes

\[ 8 \int_{\pi/2}^{\pi} \left(\frac{3}{2} + 2\cos\theta + \frac{1}{2}\cos 2\theta\right) d\theta. \]

This gives

\[ 8 \left(\frac{3}{2}\theta + 2\sin\theta + \frac{1}{4}\sin 2\theta\right) \Big|_{\pi/2}^{\pi}. \]

Hence, the area is 8(3π/4 − 2) = 6π − 16. As a check, note that this is about 2.85, which is plausible for a region that
fills much of a rectangle with an approximate “height” of 4 and an approximate “width” of about 1. □
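For an independent check of this area, we can use a different idea from the symmetry argument in the solution: in each direction θ, a point lies inside both cardioids exactly when its distance from the origin is at most the smaller of the two radii. The sketch below integrates ½ min(r₁, r₂)² over a full turn with a midpoint rule. This approach and all names in the code are our own, not the method of the solution.

```python
import math

def r_right(theta):
    return 2 * (1 + math.cos(theta))

def r_left(theta):
    return 2 * (1 - math.cos(theta))

n = 100_000  # a multiple of 4, so theta = pi/2 and 3*pi/2 fall on subinterval endpoints
h = 2 * math.pi / n
overlap_area = 0.5 * h * sum(
    min(r_right((i + 0.5) * h), r_left((i + 0.5) * h)) ** 2 for i in range(n)
)
print(overlap_area, 6 * math.pi - 16)
```

Both printed values should be close to 2.85, in line with the plausibility check at the end of the solution.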


EXERCISES 


8.3.1 Find the area between the curves r = θ and r = 2θ for 0 ≤ θ ≤ π.


8.3.2 Find the area of the region that is outside the curve r = 1 − cos θ but inside the
curve r = 1.


8.3.3 Find the shaded area, shown at right, inside the limaçon given by the graph
of r = 1 + 2 sin θ. Hints: 269, 206, 272


REVIEW PROBLEMS 
8.21 Write parametric equations to describe the curves traced by the following motions: 


(a) A particle tracing a circle with center (0,0) and radius 2, starting at (2,0) at time t = 0, moving counterclockwise
with constant speed 1.

(b) A cannonball fired from (0,0) with initial velocity 100 m/sec, shot at an angle above the ground, with gravity
g = −9.8 m/sec².


(c) A point starting at the bottom edge of a bicycle wheel (with radius 30 cm) that is rotating at 1 revolution per
second, where the bicycle is moving forward at the rate implied by the rotation of the wheel. (For example,
using meters as units, (0,0) is the starting point, and after 0.5 seconds the point is at (0.3π, 0.6).)


(d)* A particle tracing a circle with center (0,0) and radius 2, starting at (2,0) at time t = 0, moving clockwise
with speed √t. Hints: 106


8.22 Imagine a string (of negligible thickness) unwinding from a fixed circular bobbin 
of radius 10, so that the string that is unwound is always tangent to the bobbin, as in 
the picture to the right. (Three positions of the string are shown in gray.) Assume that 
the string unwinds at the constant rate of one full loop of string per second. Determine 
parametric equations for the curve traced out by the end of the string, as shown by the 
dark curve at right. Hints: 115 


8.23 Let C be a circle of radius 3 centered at (0,0). Let D be a circle of radius 1 centered at
(4,0), and let P = (3,0) be the point on D that is tangent to C. Find parametric equations
for the epicycloid that is produced when we trace the path of P as D rolls around C,
so that the center of D moves counterclockwise around C at a rate of 1 revolution per
second. Hints: 255, 268


8.24 The Archimedes spiral is the graph of r = θ for θ ≥ 0.


(a) Sketch the spiral.
(b) Find the slope of the tangent line to the spiral at θ.
(c) Find the area of the region bounded by the spiral and to the left of the y-axis when 0 ≤ θ ≤ 2π.


8.25 Find the area of the region that is in the interior of the two cardioids r =
1 + sin θ and r = 1 + cos θ. (The region is shown at right.) Hints: 253


8.26 Prove that the graph of the polar equation

\[ A\cos\theta + B\sin\theta + \frac{C}{r} = 0, \]

where A, B, C are nonzero real numbers, is a line. What is the geometric interpretation
of the constants A, B, and C? And what happens if one or more of them is
zero? Hints: 199


CHALLENGE PROBLEMS 


8.27 Suppose a planet revolves around a star at distance R and the planet has a moon that revolves around the 
planet at distance r, with 0 < r < R. (Assume that all motion is in the same plane and that all orbits are circular.) 
At time t = 0, the planet is at (R,0) and the moon is at (R + r,0). The planet takes 1 year to revolve around the star 
and the moon takes } year to revolve around the planet. 


(a) Write parametric equations for the position of the moon at time t (in years). (Assume all movement is
counterclockwise.) As a check, at time t = 1/2, the position of the moon should be (r − R, 0) and at t = 1 it
should be back to (R + r, 0).


(b) Sketch some graphs of the moon’s path. Do this for different values of 7, and try to determine the conditions 
on ; that affect the shape of the graph. Hints: 70 


(c) Does the moon ever come to a “full stop” relative to the star? Hints: 209 
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8.28 Describe the graph of r = where € > 1 and ¢ is a positive constant. Hints: 88,5 


1+¢€cos@ 
8.29 Show that the length of a parametric curve is independent of the choice of parameterization in the following 
sense: 


Suppose that (u(t), v(t)) is a parameterization of a curve C for t ∈ [a,b]. Let f be a differentiable function whose
domain includes [c,d] such that:


• f(c) = a and f(d) = b,
• f([c,d]) = [a,b], and
• f’(t) ≥ 0 for all t ∈ [c,d].


Show that C is the curve given by the parameterization (u(f(t)), v(f(t))) for t ∈ [c,d], and that the length of C
computed using the parameterization (u(f(t)), v(f(t))) equals the length of C computed using the parameterization
(u(t), v(t)). Hints: 68, 96


8.30 Consider the rose r = cos nθ where n is a positive integer.

(a) Describe the graph of the rose (in terms of n). (You will have to distinguish between the cases where n is
even and where n is odd.)

(b) Compute the area of one petal of the rose.


(c)* The width of the rose is the length of the longest segment, parallel to the y-axis, that can be inscribed in the
first petal of the rose (that is, the petal that contains part of the positive x-axis). The example for n = 2 is
shown below (the width is the length of the dark black segment):


Find the width of the rose r = cos 2θ. Hints: 52


(d)* Determine what you can about the width of the rose r = cos nθ. (You will not be able to find an explicit
formula for n > 2, but see if you can find a relatively simple equation for the value of θ such that (cos nθ, θ)
and (cos nθ, −θ) are the endpoints, in polar coordinates, of the required line segment.) Hints: 74
DIFFERENTIAL EQUATIONS


9.1 DEFINITIONS AND BASIC EXAMPLES 


A differential equation is an equation that relates a function, the function’s derivative(s), and an independent
variable. This is not a totally new idea: we’ve already seen some simple types of differential equations. For
example,

\[ y' = 2x \]

is a differential equation that relates two variables x and y. Usually, in differential equations, we have a dependent
variable (often y) that is understood to be a function of x or t or whatever our independent variable is. The context
will usually make clear which variable represents the function and which represents the independent variable.


As we know, to solve the differential equation y’ = 2x, we just compute the antiderivative of 2x:

\[ y = \int 2x\, dx = x^2 + C. \]

Of course, this only determines y up to a constant C.


Most differential equations involve both y and y’ terms. For example: 


Solution for Problem 9.1: In words, this equation asks: what function is equal to its derivative? One function that
you probably immediately think of is y = e^x, but that’s not the only answer. The complete answer is y = ce^x, where
c is a constant (possibly 0). A bit later, we will see how to show that there are no other solutions. □


Often, a differential equation will come with an initial condition. This is a condition of the form $x = x_0$, $y = y_0$
for some constants $x_0$ and $y_0$. This means that we are forcing the point $(x_0, y_0)$ to be on the graph of our solution;
we sometimes think of this as the “initial” point of our solution. Assuming that we are thinking of $y$ as a function
of $x$, we may write the initial condition as $y(x_0) = y_0$. This is a shorthand for $y = f(x)$ and $f(x_0) = y_0$, where $f$ is
some function. When writing differential equations, we generally are a little sloppy and use $y$ to represent both a
function and a dependent variable.


Regardless of how we write the initial condition, we plug it into our solution in order to solve for the unknown 
constant. Continuing our basic example: 

Problem 9.2: Solve the differential equation $y' = y$ with initial condition $y(0) = 2$.


Solution for Problem 9.2: From Problem 9.1, we know that solutions to $y' = y$ are of the form $y = ce^x$ for some
constant $c$. We plug in $x = 0$ and $y = 2$ to get $2 = ce^0 = c$, so $c = 2$. Thus, the solution is $y = 2e^x$. □


A first-order differential equation involves y’ but no higher derivatives of y. The most basic form of a first-order 
differential equation is 


$y' = f(x, y),$
where $f(x, y)$ is some function involving $x$ and $y$.


Many differential equations cannot be solved explicitly. However, we have other tools that we can use to 
analyze them. One geometric tool that we use to study first-order differential equations is called a slope field. 
Using a slope field involves thinking geometrically about what y’ = f(x, y) really means. Since one interpretation 
of y’ is as the slope of the tangent line to the graph of y, the equation y’ = f(x, y) means that the slope of a tangent 
line to the graph of y at the point (x, y) is f(x, y). 


A slope field is what results when we represent these slopes as small line
segments on the plane. For instance, shown to the right is the slope field for the
differential equation $y' = -\frac{x}{y}$. At each point $(x, y)$, we draw a small line segment
with slope $-\frac{x}{y}$. For example, at $(3, 2)$, we have a segment with slope $-\frac{3}{2}$. If $y = 0$
(that is, along the x-axis), then $-\frac{x}{y}$ is undefined, so we have segments with
“infinite” slope; that is, vertical segments.
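
The recipe for a slope field can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names `slope` and `slope_field` and the choice of an integer grid from -3 to 3 are ours, and the code tabulates the slopes rather than drawing them.

```python
def slope(x, y):
    """Right side of y' = -x/y; None marks a vertical ("infinite" slope) segment."""
    if y == 0:
        return None  # -x/y is undefined along the x-axis
    return -x / y


def slope_field(f, xs, ys):
    """Map each grid point (x, y) to the slope of the little segment drawn there."""
    return {(x, y): f(x, y) for x in xs for y in ys}


grid = range(-3, 4)  # integer grid points from -3 to 3 (an arbitrary choice)
field = slope_field(slope, grid, grid)

print(field[(3, 2)])  # slope of the segment at the point (3, 2)
print(field[(1, 0)])  # a point on the x-axis, where the segment is vertical
```

A plotting library could turn each entry of `field` into a short segment centered at its grid point; the table of slopes is the whole mathematical content of the picture.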


The key fact is: any curve that is the graph of a solution to y’ = f(x, y) must 
match these segments in slope. This allows us to sketch likely solution curves 
by tracing curves that “match” the slope field. For instance, in the differential 
equation $y' = -\frac{x}{y}$ whose slope field is shown at right, the slope field suggests
that the solution curves might be circles, as in the diagram below: 


Specifically, it looks like the solutions are given implicitly by the equation $x^2 + y^2 = c$ for any positive constant $c$.
However, this is only a guess based on our slope field. To be sure, we must check that these solutions are correct:

Problem 9.3: Show that the functions given implicitly by the equation $x^2 + y^2 = c$, where $c$ is some positive
constant, are solutions to the differential equation $y' = -\frac{x}{y}$.


Solution for Problem 9.3: We can implicitly differentiate our candidate solution:
$2x + 2yy' = 0,$
and solving for $y'$ we see that indeed $y' = -\frac{x}{y}$ (where this is defined). □


An important hard theorem of analysis states that if $f$ is continuous (although we haven't really defined what
“continuous” means for a multivariable function), then the first-order differential equation $y' = f(x, y)$ with initial
condition $y(x_0) = y_0$ has a unique solution. This is not too surprising when we think about it in terms of a slope
field: if we are given the starting point $(x_0, y_0)$, then the slope field tells us what the solution curve through $(x_0, y_0)$
must look like. We simply trace our curve so that, at each point, the direction of the curve is given by the slope
field.


Thus, in Problem 9.3, since there is exactly one curve of the form $x^2 + y^2 = c$ passing through any point of
the plane, this theorem about uniqueness of solutions tells us that these indeed are the only solutions of this 
differential equation. 


Using slope fields is a fancy game of connect-the-dots. Here’s another example: 


Problem 9.4: Use slope fields to guess at the solutions to $y' = \frac{x}{y}$. Then, confirm that your guess satisfies the
differential equation.


Solution for Problem 9.4: Note that this is almost the same function as we had in
Problem 9.3, except without the minus sign. We sketch our slope field: at each
point $(x, y)$, we draw a little line segment with slope $\frac{x}{y}$. We've drawn it somewhat
“finer” than in Problem 9.3, meaning we've drawn more little slopes, so that we
can better see the behavior. Notice the qualitative features of this slope field. Near
the x-axis and away from the y-axis are places where $|x|$ is relatively large and $|y|$ is
relatively small, so $\left|\frac{x}{y}\right|$ is large, and the slopes are near vertical. On the other hand,
near the y-axis and away from the x-axis are places where $|x|$ is relatively small
and $|y|$ is relatively large, so $\left|\frac{x}{y}\right|$ is small, and the slopes are near horizontal.


We “connect the dots”—that is, we try to draw curves along the directions 
given by the slope field—and we get something like the picture to the right. These 
curves look a lot like hyperbolas. How can we check? 


We suspect the solutions are $y^2 - x^2 = c$ for some constant $c$. To check that these
satisfy our differential equation, we implicitly differentiate:

$2yy' - 2x = 0.$

Solving for $y'$ gives $y' = \frac{x}{y}$, as we want.

Our diagram appears to have a problem at $(0,0)$, in that there are two lines crossing: the solution $y = x$ and the
solution $y = -x$. This is an apparent violation of the uniqueness of solutions through a given point. However, the
function $\frac{x}{y}$ is not continuous at $(0,0)$, so this result does not violate the theorem about uniqueness of solutions. □


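Let's do one more slope field example that looks slightly different from the previous two:

Problem 9.5: Use a slope field to guess at the solutions to $y' = 2 - y$. Then, confirm that your guess satisfies the
differential equation.

Solution for Problem 9.5: [The opening of this solution, which displays a tempting but incorrect attempt to antidifferentiate both sides directly, is illegible in the source.]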


This is incorrect: indeed, we can plug the above “solution” into the original differential equation and see that it 
doesn’t work. What we should have written is 
$y = \int (2 - y)\,dx.$


However, the last term is a function of y integrated as a function of x, so we cannot integrate it (with respect to x) 
unless we know beforehand how to express y in terms of x. 


We draw the slope field at right: at each point (x, y) we draw a little line with 
slope 2 — y. Note that the slopes only depend on y and not on x. We notice that 
the slopes seem to “funnel” the graph towards y = 2. We can thus sketch some 
solution curves, and we can guess that the graphs of the solutions look like the 
picture at right below. Intuitively, this makes sense: if y is far from 2, then y’ = 2—y 
draws y rapidly closer to 2. Then, as y gets closer to 2, the rate of change y’ gets 
small, so the y-coordinate of the curve moves very slowly towards 2. 


You might have a guess as to what equations have these curves as their graphs.
The “decaying” behavior of $y$ towards 2 as $x \to \infty$ is our clue. The correct answer
is that these curves are the graphs of $y = 2 + ce^{-x}$ for some constant $c$. Note in
particular that $c = 0$ gives the constant solution $y = 2$.

To check that our guess is correct, we compute $y' = -ce^{-x}$, and indeed we see
that

$y' = -ce^{-x} = 2 - (2 + ce^{-x}) = 2 - y.$

□


One important fact to note about drawing solution curves on a slope field
of $y' = f(x, y)$ is that the solution curves can never intersect. This is due to the
theorem about the uniqueness of the solution of a first-order differential equation
with an initial condition. Drawing a curve through a point of a slope field is the
geometric equivalent of drawing the graph of the unique solution through that point, so by the uniqueness of such
a solution, there can only be one curve through any given point, and hence the curves cannot intersect. Except
(and there is often an “except” in calculus!), there may be multiple solutions at points where the function $f(x, y)$
is not continuous. Such points (for example, the point $(0,0)$ in Problem 9.4) may have multiple solution curves
passing through them.




Besides slope fields, another way that we can study differential equations without actually explicitly solving 
them is by numerically estimating solutions. This is a very rich subject, one that is largely beyond the scope of 
this book. The simplest method of estimating solutions is to calculate repeated tangent line approximations: this 
is called Euler’s Method and is covered in more detail in Section 9.A. 
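
As a rough preview of the idea of repeated tangent line approximations, here is a minimal Python sketch. It is our own illustration, not the treatment of Section 9.A: the function name `euler`, the step size, and the choice of test problem ($y' = 2 - y$ with $y(0) = 0$, whose exact solution is $y = 2 - 2e^{-x}$) are all assumptions made for the example.

```python
import math


def euler(f, x0, y0, h, steps):
    """Estimate the solution of y' = f(x, y), y(x0) = y0, after `steps` steps of width h."""
    x, y = x0, y0
    for _ in range(steps):
        y += h * f(x, y)  # move along the tangent line at (x, y)
        x += h
    return y


def f(x, y):
    """Right side of the test equation y' = 2 - y."""
    return 2 - y


approx = euler(f, 0.0, 0.0, 0.001, 1000)  # estimate of y(1), starting from y(0) = 0
exact = 2 - 2 * math.exp(-1)              # y = 2 + c e^{-x} with c = -2
print(approx, exact)
```

Shrinking the step size `h` (and taking correspondingly more steps) generally brings the estimate closer to the exact value, at the cost of more arithmetic.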
Generally, the easiest method of solving differential equations is called separation of variables. Let's illustrate
it with our basic example from Problem 9.3. 


Problem 9.6: Our goal is to solve $y' = -\frac{x}{y}$.

(a) Write the equation as $\frac{dy}{dx} = -\frac{x}{y}$. Pretend that the “dx” and “dy” are variables, and rearrange so that all the
x terms are on one side and all the y terms are on the other side.
(b) Compute the antiderivatives of both sides.
(c) Write the solution. Why do we need only one constant and not two (since we took two antiderivatives)?


Solution for Problem 9.6:
(a) We write the differential equation as
$\frac{dy}{dx} = -\frac{x}{y}.$
We move all the x terms to one side and all the y terms to the other side. This includes “multiplying by dx”:
$y\,dy = -x\,dx.$
(b) We can antidifferentiate both sides:
$\int y\,dy = -\int x\,dx.$
This gives
$\frac{y^2}{2} = -\frac{x^2}{2} + C.$

(c) Cleaning it up a little gives $x^2 + y^2 = 2C = c$ as expected. Notice that we don't need a constant on both sides:
each antiderivative is determined up to an arbitrary constant, so their difference is also determined up to an
arbitrary constant.


This general method is called separation of variables. It requires that our differential equation be separable, 
meaning that we can split all the x terms on one side and all the y terms on the other side, with “dx” on the side 
with the x terms and “dy” on the side with the y terms. To be more specific: 


Definition: A two-variable function $f(x, y)$ is called separable if we can write
$f(x, y) = g(x)h(y),$


where g(x) is a function that depends solely on x, and h(y) is a function that depends solely on y. 


We can use separation of variables to solve the differential equation $y' = f(x, y)$
if $f$ is separable. We write $\frac{dy}{dx} = f(x, y) = g(x)h(y)$ and rearrange the equation
as
$\frac{dy}{h(y)} = g(x)\,dx.$
We then antidifferentiate both sides: the left side in terms of $y$ and the right side
in terms of $x$.

Note that we will always get a constant from our antidifferentiation. If we have an initial condition, we can
plug it in and solve for the constant to get the unique solution.


WARNING!! Even though we are treating $dx$ and $dy$ as variable terms, they are really just
notational conveniences. What we really have when we write

$\frac{dy}{h(y)} = g(x)\,dx$

is instead the equation

$\frac{1}{h(y)} \frac{dy}{dx} = g(x).$

Then when we integrate this equation with respect to $x$, the left side becomes

$\int \frac{1}{h(y)} \frac{dy}{dx}\,dx = \int \frac{1}{h(y)}\,dy,$

where the last equality is an application of the Chain Rule. In other words,
when we “multiply by dx,” we are really just setting up our equation to
properly apply the Chain Rule when antidifferentiating.


We can go back and revisit our basic example of a differential equation from Problem 9.1: 


Problem 9.7: Solve the differential equation y’ = y. 


Solution for Problem 9.7: We write this equation as $\frac{dy}{dx} = y$, and use separation of variables to write it as

$\frac{dy}{y} = dx.$

Integrating both sides gives

$\int \frac{dy}{y} = \int dx,$

giving $\log|y| = x + C$. Exponentiating then gives $|y| = e^{x+C} = ce^x$, where $c = e^C$. Note that $c > 0$ in this expression.

The absolute value sign is a bit annoying. We can get rid of it by absorbing any factor of $-1$ into the constant
$c$. So we are OK with writing $y = ce^x$. In our previous expression $|y| = ce^x$, the constant $c$ had to be positive. When
we take away the absolute value sign, we allow $c$ to be negative (or zero).

Thus, any solution of $y' = y$ is of the form $y = ce^x$ for some constant $c \in \mathbb{R}$. □


Here’s another example: 


Problem 9.8: Solve the differential equation y’ = y — 2xy with y(1) = 2. 


Solution for Problem 9.8: We write the equation as

$\frac{dy}{dx} = y(1 - 2x).$
Then separate the variables:

$\frac{dy}{y} = (1 - 2x)\,dx.$

We can now integrate both sides:

$\int \frac{dy}{y} = \int (1 - 2x)\,dx.$

This gives us

$\log|y| = x - x^2 + C,$

so we can exponentiate both sides to get
$|y| = e^{x - x^2 + C} = ce^{x - x^2}.$

(Note that we've adjusted the constant by letting $c = e^C$.)

Again, we can get rid of the absolute value sign by absorbing any factor of $-1$ into the constant $c$, giving
$y = ce^{x - x^2}$. (Recall in our previous expression $|y| = ce^{x - x^2}$, the constant $c$ had to be positive. When we take away
the absolute value sign, we allow $c$ to be negative or zero.)

Finally, since we are given an initial condition, we can solve for $c$. We plug in $x = 1$ and $y = 2$:

$2 = ce^{1 - 1^2} = ce^0 = c.$

So $c = 2$, and our solution is $y = 2e^{x - x^2}$. □
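
A quick numerical sanity check of this answer is sketched below. It is only an illustration: the helper `derivative` is our own central-difference estimate of $y'$, and the sample points are arbitrary. If $y = 2e^{x - x^2}$ really solves $y' = y - 2xy$, the two printed columns should agree closely at every sample point.

```python
import math


def y(x):
    """Candidate solution from Problem 9.8: y = 2 e^(x - x^2)."""
    return 2 * math.exp(x - x * x)


def derivative(g, x, h=1e-6):
    """Central-difference estimate of g'(x); h is an arbitrary small step."""
    return (g(x + h) - g(x - h)) / (2 * h)


# Compare the estimated y' with the right side y - 2xy at several sample points.
for x in [-1.0, 0.0, 0.5, 1.0, 2.0]:
    lhs = derivative(y, x)
    rhs = y(x) - 2 * x * y(x)
    print(x, lhs, rhs)

print(y(1.0))  # the initial condition asks for y(1) = 2
```

A check like this cannot prove that the formula is correct, but it catches algebra slips such as a wrong exponent very quickly.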


We can graph this over a slope field to double-check that our answer looks
correct. The slope field and the graph of $y = 2e^{x - x^2}$ are shown at right. We see
the expected behavior of the graph. As $x$ gets large, the function tends to 0. It
has its peak at the value of $x$ that maximizes $x - x^2$, which is $x = \frac{1}{2}$. (The slope
field is deceptive near the x-axis, since as $y$ gets close to 0, the slopes get close to
horizontal, but the slope field drawn is not fine-grained enough to show this.)


One of the most important examples of a differential equation is $y' = ky$,
where $k$ is some constant. In English, this means that the quantity changes at a
rate proportional to the quantity itself. Another way to write it is $\frac{y'}{y} = k$. This
interpretation means that $y$ is growing at a constant relative rate.




There are many “real-world” phenomena that we can model with the differential
equation $y' = ky$. Some examples are:

• population growth
• spread of disease
• radioactive decay
• temperature change
• continuously compounded interest


... or indeed any situation which is properly modeled with a constant relative growth rate. We'll look at some of 
these examples a bit more closely in a moment. 


First, let's solve the equation!

Problem 9.9: Solve the differential equation $y' = ky$, where $k$ is a constant.

Solution for Problem 9.9: We can solve this by separation of variables. Arrange the equation as

$\frac{dy}{y} = k\,dx,$

and integrate to get $\log|y| = kx + C$. Exponentiating gives
$|y| = e^{kx + C} = ce^{kx},$

where $c = e^C > 0$. Allowing $y$ to be 0 or negative lets us remove the absolute value sign and have any $c \in \mathbb{R}$, so the
solutions are $y = ce^{kx}$ for any constant $c$.

As a check, we compute
$y' = \frac{d}{dx}\left(ce^{kx}\right) = k\left(ce^{kx}\right) = ky,$
which is valid for any $c \in \mathbb{R}$. □


Not surprisingly, this is called the exponential growth differential equation: 


Important: The solutions to $y' = ky$, where $k$ is a constant, are $y = ce^{kx}$ for any constant $c$.


Before we look at specific examples, let's examine how the solutions behave. If $k > 0$, then the quantity $y$
grows without bound since $\lim_{x \to \infty} e^{kx} = \infty$. On the other hand, if $k < 0$, then the quantity $y$ tends towards 0, since
$\lim_{x \to \infty} e^{kx} = 0$. Finally, if $k = 0$, then the quantity is a constant: the differential equation is just $y' = 0$, and our
solutions are $y = c$ for a constant $c$.


We can see this differential equation in action in some problems. 


Problem 9.10: A radioactive substance decays at a rate proportional to the amount of the substance present. 
Suppose we start with 100 g of Alpinium, a highly fictitious and toxic radioactive material. After 1 hour, we 


have 75 g of Alpinium remaining. How long will it take until we have only 10 g remaining? 


Solution for Problem 9.10: The first sentence of the problem describes our differential equation: it tells us that we
have the equation $y' = ky$ for some $k < 0$. We know that $k$ is negative because the quantity is decreasing over time.
The solution to this equation is $y = ce^{kt}$ where $c$ is some other constant. (Note that we use $t$ instead of $x$, since the
independent variable is time.)

We don't know $k$ or $c$, so we might be in a little bit of trouble. However, fear not, for we are given two conditions:
at $t = 0$, we know $y = 100$, and at $t = 1$, we know $y = 75$. Plugging in the first condition gives $100 = ce^0 = c$, so
$c = 100$. Now we have $y = 100e^{kt}$, and we can plug in the second condition, giving $75 = 100e^k$. Thus $k = \log(3/4)$.

Hence our solution is that the amount of material at time $t$ is $y = 100e^{\log(3/4)t}$. We could leave it like this, or we
could write it as
$y = 100\left(e^{\log(3/4)}\right)^t = 100(3/4)^t.$

But the first form is preferred to finish the problem. We want to know when the amount of material will be 10 g.
So, we plug in $y = 10$, and we need to solve for $t$:

$10 = 100e^{\log(3/4)t}.$

This gives
$\log(1/10) = (\log(3/4))t,$
and thus
$t = \frac{\log(1/10)}{\log(3/4)} \approx \frac{-2.3026}{-0.2877} \approx 8.0039.$

Thus, it will take slightly over 8 hours for our sample to decay to 10 g. □


Another basic example is temperature change: heating and cooling. Newton’s Law of Heating states that the 
temperature of a body changes at a rate proportional to the difference between the body’s temperature and the 
ambient temperature. Here’s an example: 


Problem 9.11: Eric's office is kept at a temperature of 70°. Eric heats up his coffee in the microwave to a
temperature of 200°. After 1 minute, the coffee has cooled to 170°. Eric is picky and will only drink his coffee
if it is at least 120°. How long does Eric have to drink his coffee before he will need to reheat it?


Solution for Problem 9.11: You may notice that the units of the temperature measurements are conspicuously
absent (although the context seems to imply that the temperatures are in Fahrenheit). But do these units matter?
Not really: all that matters is the difference between the coffee and the ambient (room) temperature. So, we let
the variable $y$ at any time $t$ be the difference between the temperature of the coffee and the temperature of the
room.

Hence the differential equation is our usual $y' = ky$, with the two conditions $y(0) = 130$ and $y(1) = 100$. As we
know, the solution is $y = ce^{kt}$ for some constants $c$ and $k$ (with $k < 0$, since $y$ tends to 0).

Plugging in $y = 130$ and $t = 0$ gives $130 = ce^0 = c$, so we have $c = 130$ and thus our solution is now $y = 130e^{kt}$.
Then, plugging in $y = 100$ and $t = 1$ gives $100 = 130e^k$, so $k = \log(10/13)$. Thus the solution is

$y = 130e^{\log(10/13)t}.$
We want to know when $y = 50$, so we plug it in and solve for $t$. This gives

$50 = 130e^{\log(10/13)t},$
from which we get $t = \log(5/13)/\log(10/13) \approx 3.64$. Thus, Eric has only about 3.64 minutes to drink up. (He
should be less picky.) □
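
Problems 9.10 and 9.11 follow the same recipe: use the value at $t = 0$ to find $c$, use the value at $t = 1$ to find $k$, then solve $ce^{kt} = \text{target}$ for $t$. The short Python sketch below packages that recipe; the function name `time_to_reach` and its parameter names are our own, and it assumes the second measurement is taken at exactly $t = 1$.

```python
import math


def time_to_reach(y0, y1, target):
    """For y = c e^{kt} with y(0) = y0 and y(1) = y1, return the t at which y(t) = target."""
    k = math.log(y1 / y0)             # from y1 = y0 * e^k
    return math.log(target / y0) / k  # from target = y0 * e^{kt}


# Problem 9.10: 100 g decays to 75 g in one hour; when are 10 g left?
print(round(time_to_reach(100, 75, 10), 4))

# Problem 9.11: y is degrees above room temperature (130 -> 100 in one minute; target 50).
print(round(time_to_reach(130, 100, 50), 2))
```

Note that for the coffee problem the inputs are temperature differences from the room, not the temperatures themselves, exactly as in the solution above.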


These are just some of the real-world problems that can be simply modeled by the exponential growth equation 
y’ = ky. You will see others in the exercises. 


However, in many cases, the y’ = ky model for population growth is not quite accurate. For instance, most 
populations have external pressures (such as predators, scarcity of resources, etc.) that tend to slow down the 
growth rate as the population grows larger. For these populations, we can modify our differential equation 


slightly:
$y' = ky - ay^2.$

This is called the logistic growth equation. Note that $y = 0$ is a solution, but if $y \neq 0$, then we can write the logistic
equation as

$\frac{y'}{y} = k - ay.$

Written in this form, we note that $\frac{y'}{y}$ is the relative growth rate; that is, the growth rate as a fraction of the total
population. In the regular exponential growth model, $\frac{y'}{y}$ is constant. The new $-ay$ term in the logistic growth
equation is the “limiting” factor that slows down the growth as the population gets larger.


Before solving the logistic equation, we might inquire about an equilibrium. This is a value $c$ such that $y = c$
is a solution.

Problem 9.12: Find the equilibrium solutions of the logistic growth equation $y' = ky - ay^2$.

Solution for Problem 9.12: An equilibrium solution $y = c$ has $y' = 0$. We plug these in to the equation and get
$0 = kc - ac^2 = c(k - ac)$. Thus $c = 0$ and $c = \frac{k}{a}$ are the equilibrium solutions. □


When studying the logistic equation, we often let $L = \frac{k}{a}$ equal the nonzero equilibrium, and rewrite the equation in
terms of $L$. The equation then becomes:
$y' = ky\left(1 - \frac{y}{L}\right).$


Problem 9.13: Solve the logistic differential equation $y' = ky - ay^2$, where $k$ and $a$ are positive constants and $y$
is a function of $t$. Assuming that $y(0) > 0$, what happens to $y$ as $t \to \infty$?


Solution for Problem 9.13: We'll write the equation in terms of $L$ as

$y' = ky\left(1 - \frac{y}{L}\right).$

Note that $y = L$ is a solution: the right side of the above equation is 0, and the left side is also 0 since $y' = 0$. If
$y \neq L$, then we can separate the variables:

$\frac{dy}{y\left(1 - \frac{y}{L}\right)} = k\,dt.$
We want to integrate both sides of this, but first we can divide both sides by $L$ to make it slightly nicer:

$\frac{dy}{y(L - y)} = \frac{k}{L}\,dt.$

The left side is now the reciprocal of a factorable quadratic, so we use partial fractions on the left side. Note that
$\frac{1}{y(L - y)} = \frac{1}{L}\left(\frac{1}{y} + \frac{1}{L - y}\right).$

Our equation becomes

$\frac{1}{L}\int \left(\frac{1}{y} + \frac{1}{L - y}\right) dy = \int \frac{k}{L}\,dt.$

Now the $(1/L)$ terms cancel, and after integrating we have:

$\log|y| - \log|L - y| = kt + C.$

We consolidate the logs, and this becomes

$\log\left|\frac{y}{L - y}\right| = kt + C.$
Next, we exponentiate, giving
$\left|\frac{y}{L - y}\right| = ce^{kt},$
where $c = e^C > 0$. Allowing $c$ to be 0 or negative allows us to eliminate the absolute value, leaving
$\frac{y}{L - y} = ce^{kt}.$

If $c = 0$, then we get the solution $y = 0$ (we consider this a solution because it is a solution to the original equation
$y' = ky - ay^2$). If $c \neq 0$, then solving for $y$, we conclude that

$y = \frac{L}{1 + \alpha e^{-kt}},$

where $\alpha = c^{-1}$ is a nonzero constant. Finally, note that letting $\alpha = 0$ recovers our equilibrium solution $y = L$.

As $t \to \infty$, the exponential term in the denominator approaches 0. Therefore, $y$ approaches $L$, the limiting
population, as we expect. □
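
To see the limiting behavior concretely, the closed form can be evaluated for a few times $t$. The Python sketch below is illustrative only: the function name `logistic`, the variable `alpha` (the constant multiplying $e^{-kt}$ in the denominator), and the sample values $k = 0.5$, $a = 0.01$, $y(0) = 5$ are assumptions chosen for the example, not data from the text.

```python
import math


def logistic(t, k, a, y0):
    """Solution y = L / (1 + alpha e^{-kt}) of y' = ky - ay^2 with y(0) = y0 > 0."""
    L = k / a                # the nonzero equilibrium
    alpha = (L - y0) / y0    # chosen so that y(0) = y0
    return L / (1 + alpha * math.exp(-k * t))


k, a, y0 = 0.5, 0.01, 5.0    # illustrative values; here L = 50
for t in [0, 5, 10, 20, 40]:
    print(t, logistic(t, k, a, y0))
```

The printed values start at the initial population and climb towards $L = k/a$, leveling off as the $-ay^2$ term becomes significant.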


We'll finish this section with a harder differential equation modeling problem. 


Problem 9.14: Sometime Monday morning, it starts snowing in Los Angeles (!) at a constant rate. At noon,
a snowplow starts to plow Hollywood Boulevard in L.A., and plows in a straight line at a rate inversely
proportional to the amount of snow on the ground. The plow covers exactly twice as much ground between
noon and 1 p.m. as it does between 1 p.m. and 2 p.m. (and it snows at a constant rate throughout).
What time did it start snowing?


Solution for Problem 9.14: It’s not clear immediately how to approach this problem, so we begin by trying to model 
the problem and listing all of the data that we know. 


We can model this situation by letting $t$ be our time variable (in hours), so that $t = 0$ corresponds to noon.
Suppose that $T$ is the time that it starts snowing; note that $T < 0$. The amount of snow on the ground at time $t$ is
then $r(t - T)$, where $r$ is the hourly snowfall rate.

Let the position of the snowplow along Hollywood Boulevard be given by $s(t)$, with $s(0) = 0$. The fact that
the distance covered in the second hour is one-half the distance covered in the first hour is represented by the
equation

$s(2) - s(1) = \frac{1}{2}(s(1) - s(0)) = \frac{1}{2}s(1),$

which simplifies to $s(2) = \frac{3}{2}s(1)$. We also have the stated fact that the plow moves at a rate inversely proportional
to the amount of snow on the ground, so this gives the differential equation

$\frac{ds}{dt} = \frac{k}{r(t - T)},$
where $k > 0$ is some constant. Integrating this gives

$s(t) = \frac{k}{r}\log|t - T| + C,$

where $C$ is some constant. Since we are only concerned with $t \geq 0$, we always have $t - T > 0$, and we can eliminate
the absolute value signs. Next, plugging in $t = 0$ and using $s(0) = 0$ yields

$0 = \frac{k}{r}\log(-T) + C,$
so $C = -\frac{k}{r}\log(-T)$, and our equation becomes
$s(t) = \frac{k}{r}\left(\log(t - T) - \log(-T)\right) = \frac{k}{r}\log\left(\frac{T - t}{T}\right).$
Our last bit of data that we haven't used yet is $s(2) = \frac{3}{2}s(1)$, so we plug that in:

$\frac{k}{r}\log\left(\frac{T - 2}{T}\right) = s(2) = \frac{3}{2}s(1) = \frac{3}{2} \cdot \frac{k}{r}\log\left(\frac{T - 1}{T}\right).$

The constant terms $\frac{k}{r}$ cancel, and clearing the denominators we have

$2\log\left(\frac{T - 2}{T}\right) = 3\log\left(\frac{T - 1}{T}\right).$

The constants outside the logarithms can be brought inside as exponents, so after exponentiating both sides we
have
$\left(\frac{T - 2}{T}\right)^2 = \left(\frac{T - 1}{T}\right)^3.$
This simplifies to $T(T - 2)^2 = (T - 1)^3$, or $T^2 - T - 1 = 0$. The roots of this, by the quadratic formula, are

$T = \frac{1 \pm \sqrt{5}}{2}.$

Since $T < 0$, we choose the negative root, giving $T = \frac{1 - \sqrt{5}}{2} \approx -0.618$, and we conclude that it started snowing
approximately 0.618 hours before noon, or at about 11:23 a.m. □
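
The answer can be checked numerically against the original condition that the plow covers twice as much ground in the first hour as in the second. The Python sketch below is our own check: the function name `s` mirrors the position function in the solution, and setting $k/r = 1$ is an arbitrary choice that cancels out of the ratio.

```python
import math

T = (1 - math.sqrt(5)) / 2  # negative root of T^2 - T - 1 = 0


def s(t, k_over_r=1.0):
    """Plow position s(t) = (k/r) log((T - t)/T); the value of k/r cancels in ratios."""
    return k_over_r * math.log((T - t) / T)


first_hour = s(1) - s(0)    # ground covered from noon to 1 p.m.
second_hour = s(2) - s(1)   # ground covered from 1 p.m. to 2 p.m.
print(first_hour / second_hour)

minutes_before_noon = -T * 60
print(minutes_before_noon)  # roughly 37 minutes, which is about 11:23 a.m.
```

The first printed ratio should be 2 up to rounding error, confirming that the chosen root satisfies the condition in the problem.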


EXERCISES 


9.1.1 Sketch the slope field and some solution curves for the following equations. Find the explicit solutions if
you can.


(a) $y' = x(y - 1)$   (b) $y' = x + y$ Hints: 111   (c) $y' = [\ldots]$ Hints: 112   (d) $y' = 1 + y$


9.1.2 Solve the following differential equations:

(a) $y' = xy$   (b) $y' = xe^y$ with $y(0) = 0$   (c) $y = y'\sin x$

9.1.3 Solve $y'' = 2$ where $y(0) = 1$ and $y'(0) = 0$.

9.1.4 Solve $y' = 2y - y^2$ where $y(0) = 1$.


9.1.5 Some milk is removed from the refrigerator at a temperature of 2 degrees Celsius and placed in a glass in 
a room that’s kept at a constant temperature of 20 degrees. After 1 minute, the milk has warmed to 5 degrees. 
When will the milk warm to 10 degrees? 


9.1.6* A homogeneous differential equation is an equation of the form $y' = f\left(\frac{y}{x}\right)$ for some function $f$. For
example, $y' = \frac{x^2 + y^2}{2xy}$ is homogeneous, since the right side can be written as $\frac{1}{2}\left(\frac{x}{y} + \frac{y}{x}\right)$. Show that we can solve
any homogeneous equation by making the substitution $y = vx$.


9.2 SECOND-ORDER LINEAR DIFFERENTIAL EQUATIONS 


A type of differential equation that comes up very frequently (especially in physics) is a 2nd order linear 
homogeneous equation with constant coefficients. That’s a mouthful—what that means is an equation of the 
form 

y” +ay’ +by =0, 


where a and b are real numbers. We can decode some of the adjectives: 


• 2nd order means $y''$ appears in the equation, but not $y'''$ or any other higher derivatives of $y$.
• linear means $y''$, $y'$, and $y$ appear only by themselves as linear terms: nothing like $y^2$ is allowed.
• homogeneous means there are no terms that involve only the independent variable.
• constant coefficients means that $a$ and $b$ are real numbers and not functions of the independent variable (such
as $x$ or $t$).


To begin to study this type of differential equation, let’s first take the middle term out: 


Problem 9.15: Consider the differential equation $y'' + by = 0$, where $y$ is a function of $x$.
(a) If $b = 0$, what are the solutions?
(b) If $b < 0$, what are the solutions? (Try $b = -1$ at first if you're stuck.)
(c) If $b > 0$, what are the solutions? (Try $b = 1$ at first if you're stuck.)


Solution for Problem 9.15: The nature of the solutions depends on whether b is positive, negative or 0. However, we 
expect that in all cases, solutions to second-order differential equations will depend on two arbitrary constants, as 
opposed to a single constant that we saw in solutions to first-order equations. One reason to expect this is that we 
may have to antidifferentiate twice to solve a second-order equation, and each antidifferentiation will introduce 
an arbitrary constant. We will also see why we get two constants as we work through each case below. 


(a) Naturally, $b = 0$ is the easiest. In this case, the equation is simply $y'' = 0$. We integrate this once to get $y' = c$
for some constant $c$, and then we integrate this again to get $y = cx + d$ for some new constant $d$. More typically,
we call the constants $c_1$ and $c_2$, so that the solutions of $y'' = 0$ are all of the form

$y = c_1x + c_2,$

where $c_1$ and $c_2$ are constants. Once again, note that our solution contains two constants, since we are solving
a second-order equation.

(b) Next is the case where $b < 0$. It may be easier to determine the solutions if we write it as $y'' = -by$. This looks
very similar to the exponential growth equation $y' = ky$, except that we now have $y''$ on the left side instead
of $y'$. In a similar manner, we can search for an exponential solution of the form $y = e^{\lambda x}$ for some constant $\lambda$.
Then we have $y' = \lambda e^{\lambda x}$ and $y'' = \lambda^2 e^{\lambda x}$, so to satisfy $y'' = -by$ we must have $\lambda^2 = -b$, or $\lambda = \pm\sqrt{-b}$. (Now
we see why we must have $b < 0$; otherwise, the square root does not exist. We treat the case of $b > 0$ in part
(c) below.) For simplicity, let $k = \sqrt{-b}$, so that the original differential equation is $y'' - k^2y = 0$, and thus we
have found solutions $y = e^{\pm kx}$.

Because differentiation is linear, any linear combination of our two solutions $e^{kx}$ and $e^{-kx}$ will be a solution
to the differential equation $y'' - k^2y = 0$. To verify this, we simply compute, where $c_1$ and $c_2$ are arbitrary
constants:

$y = c_1e^{kx} + c_2e^{-kx}$
$y' = kc_1e^{kx} - kc_2e^{-kx}$
$y'' = k^2c_1e^{kx} + k^2c_2e^{-kx} = k^2y.$


(c) The method from part (b) doesn't work if $b > 0$, because $\sqrt{-b}$ is not a real number. Instead, for this case, we
can look first at $b = 1$ to get an idea. If $b = 1$, then we are looking for solutions to $y'' + y = 0$. Happily, we
already know two functions that are solutions to this equation: $y = \sin x$ and $y = \cos x$. Moreover, we can
prove that any solution is of the form $y = c_1\sin x + c_2\cos x$ for constants $c_1$ and $c_2$, although we will leave the
details of this as a Challenge Problem.

To extend this argument to other values of $b$, we use essentially the same logic as in part (b). Each
differentiation of $y$ must introduce a factor of $\omega = \pm\sqrt{b}$, so that differentiating twice will introduce a factor of
$\omega^2 = b$. Thus, the solutions of $y'' + \omega^2y = 0$ are of the form
$y = c_1\sin\omega x + c_2\cos\omega x,$

where $c_1$ and $c_2$ are constants.


Let’s summarize the solutions to Problem 9.15. 


Important: In all of the following, $k$ and $\omega$ are positive constants, and $c_1$ and $c_2$ are arbitrary
constants.

• Solutions to $y'' = 0$ are of the form $y = c_1x + c_2$.
• Solutions to $y'' - k^2y = 0$ are of the form $y = c_1e^{kx} + c_2e^{-kx}$.
• Solutions to $y'' + \omega^2y = 0$ are of the form $y = c_1\sin\omega x + c_2\cos\omega x$.


Now let’s turn our attention to the more general form of the equation: 
y” +ay’ + by =0, 


where a and b are constants. Since exponentials figure prominently in solutions to many of the differential 
equations that we have already seen, let’s start our investigation there. 


Problem 9.16: Suppose that $y = e^{rx}$ is a solution to $y'' + ay' + by = 0$, where $r$ is a constant. What equation
does $r$ satisfy?

Solution for Problem 9.16: We can immediately compute $y' = re^{rx}$ and $y'' = r^2e^{rx}$. We plug these in to the differential
equation and get
$r^2e^{rx} + are^{rx} + be^{rx} = 0.$

Dividing by $e^{rx}$ (which is never 0) leaves $r^2 + ar + b = 0$. □


Thus, we get an exponential solution of $y'' + ay' + by = 0$ of the form $y = e^{rx}$ if and only if $r$ is a root of the
characteristic polynomial $\lambda^2 + a\lambda + b$. More generally, if this quadratic has two distinct real roots $r_1$ and $r_2$, then
every solution to the differential equation is of the form

$y = c_1e^{r_1x} + c_2e^{r_2x}$

for some constants $c_1$ and $c_2$.
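
As a concrete illustration of the distinct-real-roots case, here is a short worked example. The particular equation $y'' - 3y' + 2y = 0$ is our own choice and does not come from the text; it is picked only because its characteristic polynomial factors easily.

```latex
\begin{align*}
y'' - 3y' + 2y &= 0 \\
\lambda^2 - 3\lambda + 2 &= (\lambda - 1)(\lambda - 2), \quad\text{so } r_1 = 1,\ r_2 = 2 \\
y &= c_1 e^{x} + c_2 e^{2x} \\[4pt]
\text{Check: } y' &= c_1 e^{x} + 2c_2 e^{2x}, \qquad y'' = c_1 e^{x} + 4c_2 e^{2x} \\
y'' - 3y' + 2y &= c_1(1 - 3 + 2)e^{x} + c_2(4 - 6 + 2)e^{2x} = 0
\end{align*}
```

Both coefficients vanish precisely because 1 and 2 are roots of the characteristic polynomial.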


WARNING!! We have not actually proved that every solution to the above differential
equation is of the form $y = c_1e^{r_1x} + c_2e^{r_2x}$. We have shown that these are
solutions, but we haven't proved that there are no other solutions. In fact,
those are all of the solutions, although this is difficult to prove with the tools
we currently have.


If the characteristic polynomial A* + aA + b has a double root r, then every solution to the differential equation 
is of the form 


y = (cx + ca)e™, 


although we will leave it as an exercise to check that these are in fact solutions. 
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The remaining case is if the roots of the characteristic polynomial are not real. Not surprisingly, in light of part 
(c) of Problem 9.15, we get trigonometric terms: 


Problem 9.17: Consider the differential equation $y'' + ay' + by = 0$, and suppose that the roots of the
characteristic polynomial $\lambda^2 + a\lambda + b$ are $r \pm si$, where $s \neq 0$. Show that $y = e^{rx}\sin sx$ is a solution to the
differential equation.


Solution for Problem 9.17: Let $y = e^{rx}\sin sx$. We first compute $y'$ and $y''$:
$$y' = se^{rx}\cos sx + re^{rx}\sin sx = e^{rx}(r\sin sx + s\cos sx),$$
$$y'' = e^{rx}(rs\cos sx - s^2\sin sx) + re^{rx}(r\sin sx + s\cos sx) = e^{rx}\bigl((r^2 - s^2)\sin sx + 2rs\cos sx\bigr).$$
This gives
$$y'' + ay' + by = e^{rx}\bigl(((r^2 - s^2) + ar + b)\sin sx + (2rs + as)\cos sx\bigr).$$


But we also know that the sum of the roots of $\lambda^2 + a\lambda + b = 0$ is $-a$, and the product of the roots is $b$. Therefore,
$2r = -a$ and $r^2 + s^2 = b$, which gives
$$y'' + ay' + by = e^{rx}\bigl(((2r^2 - b) + ar + b)\sin sx + (-as + as)\cos sx\bigr) = e^{rx}\bigl((-ar + ar)\sin sx\bigr) = 0,$$
as desired. □


A similar computation shows that $y = e^{rx}\cos sx$ is also a solution to the differential equation (we will leave the
details of this computation as an exercise). Further, any linear combination of solutions is also a solution. Thus,
we see that solutions to $y'' + ay' + by = 0$, where the characteristic polynomial has complex roots $r \pm si$, are
$$y = e^{rx}(c_1 \sin sx + c_2 \cos sx)$$
for constants $c_1$ and $c_2$.
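

The algebra in the complex-root case can be sanity-checked numerically. The following is only a sketch: the function name `ode_residual`, the step size `h`, and the sample equation $y'' + 2y' + 5y = 0$ (with roots $-1 \pm 2i$) are illustrative choices, not part of the text. It estimates $y'' + ay' + by$ with central differences and checks that the result is close to 0.

```python
import math

def ode_residual(a, b, x, h=1e-4):
    """Estimate y'' + a*y' + b*y at x for y = e^(rx) sin(sx), using central differences."""
    disc = a * a - 4 * b
    if disc >= 0:
        raise ValueError("this check assumes non-real roots (a*a < 4*b)")
    # The roots of the characteristic polynomial are r + si and r - si.
    r, s = -a / 2, math.sqrt(-disc) / 2

    def y(t):
        return math.exp(r * t) * math.sin(s * t)

    dy = (y(x + h) - y(x - h)) / (2 * h)            # approximates y'(x)
    d2y = (y(x + h) - 2 * y(x) + y(x - h)) / h**2   # approximates y''(x)
    return d2y + a * dy + b * y(x)

# y'' + 2y' + 5y = 0 has characteristic roots -1 ± 2i, so y = e^(-x) sin(2x).
for x in (0.0, 0.5, 1.0, 2.0):
    print(x, ode_residual(2, 5, x))   # each residual should be close to 0
```

Because finite differences are only approximations, the residuals are tiny rather than exactly zero; the exact cancellation is what the computation in Problem 9.17 establishes.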


We can now summarize the solutions to second-order linear differential equations with constant coefficients: 


Important: Consider the differential equation
$$y'' + ay' + by = 0,$$
with characteristic equation $\lambda^2 + a\lambda + b = 0$. In all cases, $c_1$ and $c_2$ are constants.


• If the characteristic equation has distinct real roots $r_1$ and $r_2$, then every
solution is of the form
$$y = c_1 e^{r_1 x} + c_2 e^{r_2 x}.$$


• If the characteristic equation has a double root $r$, then every solution is of
the form
$$y = e^{rx}(c_1 x + c_2).$$


• If the characteristic equation has complex roots $r \pm si$, then every solution
is of the form
$$y = e^{rx}(c_1 \sin sx + c_2 \cos sx).$$
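

The three cases can be told apart by the sign of the discriminant $a^2 - 4b$ of the characteristic equation. Here is a minimal sketch of that decision in Python; the function name `general_solution` and the string output format are hypothetical choices made for illustration.

```python
import math

def general_solution(a, b):
    """Describe the general solution of y'' + a*y' + b*y = 0 as a string."""
    disc = a * a - 4 * b              # discriminant of λ² + aλ + b
    if disc > 0:                      # two distinct real roots
        root = math.sqrt(disc)
        r1, r2 = (-a + root) / 2, (-a - root) / 2
        return f"y = c1*exp({r1:g}x) + c2*exp({r2:g}x)"
    if disc == 0:                     # double root (exact comparison; fine for integer inputs)
        r = -a / 2
        return f"y = exp({r:g}x)*(c1*x + c2)"
    r, s = -a / 2, math.sqrt(-disc) / 2   # complex roots r ± si
    return f"y = exp({r:g}x)*(c1*sin({s:g}x) + c2*cos({s:g}x))"

print(general_solution(-5, 6))   # distinct real roots 3 and 2
print(general_solution(2, 1))    # double root -1
print(general_solution(2, 5))    # complex roots -1 ± 2i
```

The exact test `disc == 0` is reliable for small integer coefficients; with floating-point coefficients a tolerance would be a safer design.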


A common example of this sort of differential equation is the motion of a spring. 


Solution for Problem 9.18: We can write Hooke’s Law as $F = -ks$, where $F$ is force, $s$ is displacement, and $k > 0$ is
the constant of proportionality. We are also given Newton’s Second Law $F = ma$, and we know that acceleration
is the second derivative of displacement.


Putting this all together, we have the equation
$$ms'' = -ks, \quad\text{that is,}\quad s'' + \omega^2 s = 0 \text{ where } \omega = \sqrt{k/m}.$$


As we have just seen, the solutions to this differential equation are of the form
$$s = c_1 \cos \omega t + c_2 \sin \omega t.$$


This does not quite look like what we want, which is $A\sin(\theta + \omega t)$. But we can expand our desired answer using
the sine angle-addition formula:
$$A\sin(\theta + \omega t) = A(\sin\theta\cos\omega t + \cos\theta\sin\omega t).$$


This does look like the solution to our differential equation, provided that we can solve $c_1 = A\sin\theta$ and $c_2 = A\cos\theta$
for some $\theta$. We must have $A = \sqrt{c_1^2 + c_2^2}$, so we divide both sides of our solution by $A$:
$$\frac{s}{A} = \frac{c_1}{A}\cos\omega t + \frac{c_2}{A}\sin\omega t.$$


Now we can choose $\theta$ so that $\sin\theta = \frac{c_1}{A}$ and $\cos\theta = \frac{c_2}{A}$, noting that $\sin^2\theta + \cos^2\theta = 1$. Then we have
$$\frac{s}{A} = \sin\theta\cos\omega t + \cos\theta\sin\omega t.$$


The right side is exactly the sine angle-sum formula, and we can finish:
$$s = A\sin(\theta + \omega t),$$
as desired. □
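

The conversion from the constants $c_1, c_2$ to an amplitude $A$ and phase $\theta$ is easy to carry out numerically. The sketch below assumes the convention used in the solution, namely $\sin\theta = c_1/A$ and $\cos\theta = c_2/A$; the function name `amplitude_phase` and the sample values are hypothetical.

```python
import math

def amplitude_phase(c1, c2):
    """Return (A, theta) such that c1*cos(wt) + c2*sin(wt) = A*sin(theta + wt)."""
    A = math.hypot(c1, c2)        # A = sqrt(c1^2 + c2^2)
    theta = math.atan2(c1, c2)    # angle with sin(theta) = c1/A and cos(theta) = c2/A
    return A, theta

# Compare both forms at a few sample times, with illustrative values c1 = 3, c2 = 4, w = 2.
c1, c2, w = 3.0, 4.0, 2.0
A, theta = amplitude_phase(c1, c2)
for t in (0.0, 0.3, 1.7):
    lhs = c1 * math.cos(w * t) + c2 * math.sin(w * t)
    rhs = A * math.sin(theta + w * t)
    print(t, lhs, rhs)            # the two columns should agree
```

Using `atan2` rather than `atan` picks the correct quadrant for $\theta$ automatically, which is the practical counterpart of choosing $\theta$ from both its sine and its cosine.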


EXERCISES 
9.2.1 Solve the following differential equations:


(a) $y'' - 4y' + 3y = 0$ where $y(0) = 1$ and $y'(0) = 2$.
(b) $y'' + 6y' + 9y = 0$ where $y(0) = 0$ and $y'(0) = 1$.
(c) $y'' - 4y' + 13y = 0$ where $y(0) = 2$ and $y'(0) = -1$.


9.2.2 Show that, if $\lambda^2 + a\lambda + b = 0$ has complex roots $\lambda = r \pm si$, then $e^{rx}\cos sx$ is a solution to the differential
equation $y'' + ay' + by = 0$.


9.2.3 Show that, if $\lambda^2 + a\lambda + b = 0$ has a double root $r$, then $e^{rx}(c_1 x + c_2)$ is a solution to the differential equation
$y'' + ay' + by = 0$. Hints: 35


9.2.4x Solve the differential equation y” + 2y’ + 2y = e~**. Hints: 85, 200 


REVIEW PROBLEMS 

9.19 Solve the differential equation $y' = 1 + y^2$ with initial condition $y(0) = 0$.

9.20 Solve the differential equation $(x^2 + 1)y' = (xy)^2$ with initial condition $y(1) = 2$.

9.21 Solve the equation $y' = 3y - y^2$ with initial condition $y(0) = 5$, and determine $\lim_{t\to\infty} y(t)$.


9.22 Solve the equation $2y'' + (y')^2 = -1$. Hints: 110

9.23 Solve $y'' = y'$ with initial conditions $y(0) = 1$ and $y(1) = 2$.


9.24 Motion of an object through a fluid can be modeled by $y' = -ky^2$ where $k$ is a positive constant. Suppose an
object starts at time $t = 0$ sec with velocity 10 m/sec and $k = 0.5$. Determine when the object slows to 1 m/sec.


9.25 We can model the rate that a rumor spreads among a population with the differential equation $\frac{dy}{dt} = ky(1-y)$,
where $0 < y < 1$ is the fraction of the population that knows the rumor, and $k > 0$ is a constant.

(a) Explain in words why this differential equation is a reasonable model. Hints: 4 

(b) Solve the equation. 


(c) If 10% of the population knows the rumor at noon on Sunday, and 20% of the population knows the rumor 
at noon on Monday, then approximately when will 90% of the population know the rumor? 


9.26 Solve the differential equation (sin x)y’ + (cos x)y = tanx. Hints: 250 


CHALLENGE PROBLEMS 


9.27 A not uncommon calculus mistake is to believe that the product rule for derivatives says that $(fg)' = f'g'$.
If $f(x) = e^{x^2}$, determine, with proof, whether there exists an open interval $(a,b)$ and a nonzero continuous function
$g$ defined on $(a,b)$ such that this wrong product rule is true for $x$ in $(a,b)$. (Source: Putnam)


9.28 


(a) Explain how the equation $y' = ay^2 - by$, where $a$ and $b$ are positive real numbers, models a population with
a birth rate represented by $a$ and a death rate represented by $b$.


(b) Solve the equation $y' = ay^2 - by$, where $a$ and $b$ are positive real numbers.


(c) Further suppose that $y(0) = m$. How does the behavior of the solutions depend on whether $m > b/a$, $m < b/a$,
or $m = b/a$? Does this equation seem like a good model?


9.29 A first-order linear differential equation is a differential equation of the form
$$y' + p(x)y = q(x),$$
where $p$ and $q$ are functions of $x$.


(a) Solve y’ — 2y = e~* by multiplying both sides by e~ and integrating. 

(b) Find a function $h(x)$, in terms of $p(x)$, such that $(yh)' = (y' + py)h$. Hints: 211

(c) Multiply $y' + py = q$ on both sides by the function $h$ from part (b), and solve the equation.
Note: the function $h$ is called an integrating factor for the linear differential equation.


9.30 We prove that all solutions of $y'' + y = 0$ are of the form $y = c_1 \sin x + c_2 \cos x$ for some constants $c_1$ and $c_2$,
as follows:


(a) Show that if $f$ is a function such that $f''(x) + f(x) = 0$ for all $x$, then the function
$$g(x) = (f(x))^2 + (f'(x))^2$$
is a constant function. Hints: 138


(b) Using part (a), show that if $f$ is a function such that $f''(x) + f(x) = 0$, and $f(0) = f'(0) = 0$, then $f$ is the
constant function 0.


(c) Prove that if $f$ is a function such that $f''(x) + f(x) = 0$, then
$$f(x) = f'(0)\sin x + f(0)\cos x.$$


9.31 Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function such that $f'(x) = f(1 - x)$ for all $x$ and $f(0) = 1$. Find $f(1)$. (Source:
HMMT) Hints: 263


9.32 (This problem is closely based on a problem from [H-H].) 


Juliet is in love with Romeo, but Romeo is more fickle. The more Juliet loves him, the more he hates her, and 
the more she hates him, the more he loves her. On the other hand, Juliet is more sensible: the more Romeo loves 
her, the more she loves him, and the more he hates her, the more she hates him. 


(a) Explain why a reasonable model for their love for each other is
$$\frac{dr}{dt} = -kj, \qquad \frac{dj}{dt} = kr,$$
where $j$ and $r$ denote their love (if positive) or hate (if negative) for each other, and where $k$ is a positive
constant.


(b) Solve this system for r and j. Your answer will depend on k and two other arbitrary constants. 


(c) Suppose at time t = 0, Romeo is fully in love with Juliet (so that r(0) = 1), but Juliet is indifferent to Romeo 
(so that j(0) = 0). Write equations for the solution in terms of k. 


(d) Sketch the solution from (c), and explain in words what is going on. 


(e)* (If you want to explore further) How would things change if we had 


where $k$ and $\ell$ are positive constants?


9.33★ Find all real-valued continuously differentiable functions $f$ with domain $\mathbb{R}$ such that for all $x \in \mathbb{R}$,
$$(f(x))^2 = 2009 + \int_0^x \left((f(t))^2 + (f'(t))^2\right) dt.$$
(Source: Putnam) Hints: 20, 184


9.34★ Find all continuous, infinitely differentiable functions $f$ with domain $\mathbb{R}$ such that
$$f(x)f(y) = \int_{x-y}^{x+y} f(t)\,dt$$
for all $x, y \in \mathbb{R}$. (Warning: this is quite hard. First just play with the equation a bit, to see if you can discover
any facts about such a function. Then try to find all the solutions where $f$ is a polynomial. Then try to find some
non-polynomial solution(s). Then try to prove that you’ve found them all.) Hints: 86, 172, 173, 114, 287, 38


9.A EULER’S METHOD


As mentioned in Section 9.1, Euler’s Method is a method for approximating solutions to differential equations 
of the form y’ = f(x,y). Euler’s Method is basically just repeated tangent line approximation, as we shall see. 


Let’s go back to our first example from Problem 9.3 of $y' = -\frac{x}{y}$ with initial condition $y(0) = 1$. We know the
solution to this is the circle $x^2 + y^2 = 1$. When $x = 0.3$, we have $y(0.3) = \sqrt{1 - (0.3)^2} = \sqrt{0.91} \approx 0.954$. Let’s now see
how we can estimate $y(0.3)$ using only the differential equation $y' = -\frac{x}{y}$.


Problem 9.35: Suppose a function $y$ satisfies $y' = -\frac{x}{y}$ and $y(0) = 1$. We wish to estimate $y(0.3)$.


(a) Using $y(0) = 1$ and the fact that $y'(0) = -\frac{0}{1} = 0$, estimate $y(0.1)$ via a tangent line approximation.

(b) Using the value of $y(0.1)$ that you found in part (a), compute an estimate for $y'(0.1)$.

(c) Using the values from parts (a) and (b), estimate $y(0.2)$ via a tangent line approximation.

(d) Repeat steps (b) and (c) to estimate $y(0.3)$.


Solution for Problem 9.35: Our method is basically to perform repeated tangent-line approximations. We can decide 
how accurate to make the approximation by choosing how many “steps” to do. Since we want y(0.3) and we’re 
starting with y(0), we'll try steps of 0.1, so we will end up doing three successive tangent-line approximations: 
first y(0.1), then y(0.2), and finally y(0.3). 


(a) Our first step is to apply our usual tangent line approximation, starting at the point $(0,1)$ with slope $y' =
-0/1 = 0$. As you recall, the expression for the tangent line approximation at $a$ is
$$y(x) \approx y(a) + y'(a)(x - a).$$


(If you don’t “recall” this, please go back to Chapter 4 and review it.) Since $y'(0) = 0$, our tangent line is just
the horizontal line $y = 1$, and thus our first approximation is $y(0.1) \approx 1$. If we insist on applying the formula,
we get
$$y(0.1) \approx y(0) + y'(0)(0.1) = 1 + 0(0.1) = 1.$$


(b) Now we’re at the point $(0.1, 1)$. We use the differential equation to get the slope for our next tangent line
approximation, by substituting $x = 0.1$ and $y = 1$ to get $y'(0.1) = -x/y = -0.1/1 = -0.1$.

(c) Now we have $y(0.1)$ and $y'(0.1)$, so we can use these to perform a tangent line approximation of $y(0.2)$:
$$y(0.2) \approx y(0.1) + y'(0.1)(0.1) \approx 1 + (-0.1)(0.1) = 0.99.$$


(d) Now we’re at the point $(0.2, 0.99)$, so we substitute $x = 0.2$ and $y = 0.99$ into the differential equation to get
$y'(0.2) = -x/y \approx -0.2/0.99 \approx -0.202$. Thus, our final tangent line approximation is
$$y(0.3) \approx y(0.2) + y'(0.2)(0.1) \approx 0.99 + (-0.202)(0.1) \approx 0.970.$$


So our final answer is $y(0.3) \approx 0.970$. □


We recall the exact answer was approximately 0.954, so our estimate is not that great, but it is considerably
better than a 1-step tangent line approximation starting at $(0,1)$ would give us. That would give us $y(0.3) \approx 1$,
since $y'(0) = 0$ would be the slope of our tangent line at $x = 0$. Not surprisingly, performing more steps (for
example, increasing $x$ by 0.05 between estimates instead of 0.1) would give a better approximation.
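

The repeated tangent-line steps above are mechanical enough to hand to a computer. The following is a minimal sketch, not a library routine: the function name `euler` and its parameters are chosen here for illustration. It repeats the three steps of size 0.1 and then tries six steps of size 0.05.

```python
def euler(f, x0, y0, h, steps):
    """Approximate y(x0 + steps*h) for y' = f(x, y) with y(x0) = y0."""
    x, y = x0, y0
    for _ in range(steps):
        y += h * f(x, y)   # follow the tangent line at (x, y) for one step of width h
        x += h
    return y

def slope(x, y):
    return -x / y          # the differential equation y' = -x/y

three_steps = euler(slope, 0.0, 1.0, 0.1, 3)    # about 0.970, matching the hand computation
six_steps = euler(slope, 0.0, 1.0, 0.05, 6)     # smaller steps, same endpoint x = 0.3
exact = (1 - 0.3**2) ** 0.5                     # about 0.954, from the circle x^2 + y^2 = 1
print(three_steps, six_steps, exact)
```

Halving the step size moves the estimate closer to the exact value, at the cost of twice as many slope evaluations.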


Euler’s Method is just repeated application of tangent line approximation. 


Let’s see a slightly different example, where we don’t even have a differential equation to start with. 


Problem 9.36: Estimate $e^{0.3}$ using a 3-step Euler’s Method approximation starting from $e^0 = 1$.


Solution for Problem 9.36: To put this more in the context of Problem 9.35, we use the fact that $y = e^x$ is a solution
to the differential equation $y' = y$.


We start at the point $(0,1)$. The first tangent line approximation uses $y'(0) = 1$. So we have
$$y(0.1) \approx 1 + 1(0.1) = 1.1.$$
Now we’re at the point $(0.1, 1.1)$. We have $y'(0.1) = y \approx 1.1$, so we have
$$y(0.2) \approx y(0.1) + (0.1)y'(0.1) \approx 1.1 + (0.1)(1.1) = 1.21.$$
Now we’re at the point $(0.2, 1.21)$. We have $y'(0.2) = y \approx 1.21$. So we have
$$y(0.3) \approx y(0.2) + (0.1)y'(0.2) \approx 1.21 + (0.1)(1.21) = 1.331.$$


Thus, we conclude that $e^{0.3} \approx 1.331$. □


Your calculator will say $e^{0.3} = 1.349859\ldots$, so this is not an especially great approximation. We can make the
estimate of $e^{0.3}$ more accurate by doing more steps to get from 0 to 0.3. Here’s the calculation for 10 steps; in each
step, $x$ increases by 0.03:


x          y          y'         new y
0.000000   1.000000   1.000000   1.030000
0.030000   1.030000   1.030000   1.060900
0.060000   1.060900   1.060900   1.092727
0.090000   1.092727   1.092727   1.125509
0.120000   1.125509   1.125509   1.159274
0.150000   1.159274   1.159274   1.194052
0.180000   1.194052   1.194052   1.229874
0.210000   1.229874   1.229874   1.266770
0.240000   1.266770   1.266770   1.304773
0.270000   1.304773   1.304773   1.343916


This gives $e^{0.3} \approx 1.3439\ldots$, which is better but still not too good. If we did 100 steps (which I won’t torture you
by showing, but is a nice computer programming exercise if you are so inclined), we get $e^{0.3} \approx 1.3493$, which is
within 0.001 of the true value.
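

For readers who are so inclined, here is one possible version of that programming exercise. It is a sketch under simple assumptions: the function name `euler_exp` is hypothetical, and the method is plain Euler’s Method applied to $y' = y$ with $y(0) = 1$.

```python
import math

def euler_exp(x_end, steps):
    """Estimate e**x_end via Euler's Method on y' = y, y(0) = 1."""
    h = x_end / steps      # width of each step
    y = 1.0
    for _ in range(steps):
        y += h * y         # the slope y' equals the current value of y
    return y

for n in (3, 10, 100):
    estimate = euler_exp(0.3, n)
    # Print the step count, the estimate, and the error against math.exp.
    print(n, round(estimate, 6), round(math.exp(0.3) - estimate, 6))
```

Each step multiplies $y$ by $1 + h$, so the $n$-step estimate is $(1 + 0.3/n)^n$, which ties Euler’s Method to the familiar limit definition of $e^x$.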


EPILOGUE 




NOW WHAT?


Congratulations on completing the book! You should now have a good understanding of the fundamentals of 
single-variable calculus. What do you do now? 


Most likely, you are now at the stage of your career where the mathematics that you “have to” study is dictated 
by what subject you are going to pursue. For many paths of study, you may not “need” any more mathematics! 
But assuming that most readers of this book will be pursuing careers in mathematics, science, engineering, or 
some other technically-demanding field, there is probably more mathematics in your future. 


One very good option is to study physics. You may have already seen some physics, but physics without 
calculus is like language without verbs: it’s hard to do much physics without some basic calculus. For example, 
one of the most fundamental equations in physics is F = ma, but since a (acceleration) is the second derivative of 
position, what is F = ma if not a differential equation? Virtually all major concepts in physics hinge on calculus—in 
fact, it’s largely why much of calculus was invented! 


Another option is to study statistics. You've likely already seen some discrete statistics or probability, where 
you were analyzing finite sets of data or situations with finitely many discrete outcomes. Now that you know 
calculus, you can explore the much wider arena of probability and statistics of events with continuous outcomes. 
For example, many events can be modeled by a normal distribution, where an event with mean $\mu$ and variance
$\sigma^2$ is modeled with the probability density function
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$


Using this function, the probability that $a < x < b$ is $\int_a^b f(x)\,dx$. (Note that $\int_{-\infty}^{\infty} f(x)\,dx = 1$ by Problem 6.31,
reflecting the fact that the probability of any outcome should be 1.)


Within pure mathematics, there are a number of possibilities for the next course after single-variable calculus. 
The most obvious choice is to continue with multivariable calculus or vector calculus (these two are essentially 
the same material with slightly different nuances). This subject extends the tools of calculus that we developed 
in this book to functions of two or more variables. It also generalizes many of the techniques from Chapter 8 to 
work with curves and surfaces in 3-dimensional space. 


You could also embark on a course of study of differential equations. Our work in Chapter 9 barely scratched 
the surface of this very deep subject. 


A third direction is to study linear algebra. Linear algebra, at its core, is the study of vectors and matrices,
but it is much more than that. Also, many argue that having some background with linear algebra is preferred 
before beginning multivariable calculus or differential equations, because the tools of linear algebra are necessary 
in both of these subjects. 


Finally, we will repeat what we said in the Preface: we strongly recommend that students study discrete 
mathematics, which broadly includes combinatorics, probability, logic, and number theory. These subjects are 
often overlooked—especially in high-school curricula—but are vitally important and are broadly applicable to 
many “real world” problems. 


The overall point is that once you’ve completed a first course in calculus, mathematics is no longer “do A, then 
B, then C” like it might have been throughout high school. There are many different branches of math to explore, 
most of them interrelated. Don’t be in a rush to pigeonhole yourself—explore lots of different areas of math, and 
most importantly, have fun! 


EXPLANATION OF TOP-OF-CHAPTER DIAGRAMS 


The diagrams at the top of each chapter are graphs of Fourier series approximations. Specifically, the picture
at the top of Chapter $n$ (where $1 \le n \le 9$) is the graph of
$$y = \frac{2}{\pi}\sum_{k=1}^{n} \frac{\sin((2k-1)\pi x)}{2k-1}.$$
As $n \to \infty$, this function approaches the square wave function:
$$y = \begin{cases} \frac{1}{2} & \text{if } \lfloor x \rfloor \text{ is even,}\\ -\frac{1}{2} & \text{if } \lfloor x \rfloor \text{ is odd.}\end{cases}$$
The graph of this function is shown at the top of the previous page.
The $n$th Fourier series approximation of a function $f(x)$ with period $2L$ is
$$F_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n} \left(a_k \cos\frac{k\pi x}{L} + b_k \sin\frac{k\pi x}{L}\right),$$
where the coefficients are given by:
$$a_0 = \frac{1}{L}\int_{-L}^{L} f(x)\,dx, \qquad a_k = \frac{1}{L}\int_{-L}^{L} f(x)\cos\frac{k\pi x}{L}\,dx, \qquad b_k = \frac{1}{L}\int_{-L}^{L} f(x)\sin\frac{k\pi x}{L}\,dx.$$


If we extend this to an infinite series, we get the Fourier series for $f(x)$. Part of the reason that the series converges
is that the integrals above approach 0 as $k \to \infty$, due to the result from Problem 6.32.
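

To see a Fourier approximation converge, one can evaluate partial sums numerically. The sketch below assumes the square wave takes the values $\pm\frac{1}{2}$ with period 2, so that its partial sums are $\frac{2}{\pi}\sum_{k=1}^{n} \frac{\sin((2k-1)\pi x)}{2k-1}$; that normalization and the function name `square_wave_partial_sum` are assumptions made here for illustration.

```python
import math

def square_wave_partial_sum(x, n):
    """Return (2/pi) * sum over k = 1..n of sin((2k-1)*pi*x) / (2k-1)."""
    total = 0.0
    for k in range(1, n + 1):
        m = 2 * k - 1                       # only odd harmonics appear
        total += math.sin(m * math.pi * x) / m
    return 2 / math.pi * total

# At x = 0.5 (floor is even) the sums approach 1/2; at x = 1.5 (floor is odd) they approach -1/2.
for n in (1, 9, 200):
    print(n, square_wave_partial_sum(0.5, n), square_wave_partial_sum(1.5, n))
```

Convergence is slow near the jumps at integer values of $x$, which is why the graphs for small $n$ show visible ripples.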


Just as Taylor polynomials compute the best polynomial approximations of a function at a point, Fourier series 
compute the best approximations of a function as a sum of sine and cosine functions. But whereas Taylor series 
give an approximation of a function at a single point, Fourier series are instead global in nature, considering the 
entire function at once, and thus are used to study the global properties of the function. 


Fourier series are useful for studying functions which are periodic in nature. Such functions arise in acoustics, 
cryptography, and signal processing. Fourier series are also essential for studying quantum mechanics and 
differential equations, and Fourier analysis even makes multiplication of large numbers easier. 


In addition to the books cited below, there are links to many useful websites on the book’s links page at
http://www.artofproblemsolving.com/BookLinks/Calculus/links.php


Includes links to the HMMT and Putnam competition, and MIT Integration Bee websites, Wolfram|Alpha, other 
online calculus sites, and any other websites that we think are cool. Also contains the errata list for this textbook. 


Before studying calculus, you should be ready. If you need to brush up on your precalculus topics, a good source 
is: 


[Ru] R. Rusczyk, Precalculus, AoPS Incorporated, 2009. 


Covers trigonometry, complex numbers, and an introduction to vectors and matrices. Lots of hard problems, 
including many problems from advanced US high-school math contests such as the AIME and USAMO. 


Next are a couple of “mainstream” calculus textbooks. Most widely-used college calculus textbooks (including 
the two listed below) come in many varieties: single-variable (like this book) vs. multi-variable (typically the next 
calculus course after this book) vs. both, and “early transcendentals” (where exp and log are introduced early,
as in this book) vs. “late transcendentals” (where exp and log are not introduced until after differentiation and 
integration are defined). 


[H-H] D. Hughes-Hallett et al., Calculus (6th Edition), John Wiley & Sons, 2012. 


An above-average “mainstream” calculus textbook. The “Single Variable” edition is the version most closely 
aligned with this textbook. Light on rigor in spots, but a stronger emphasis than most books on using calculus 
to model “real-world” problems. 


[St] J. Stewart, Calculus (7th edition), Brooks Cole, 2011. 


By far the most widely-used college calculus textbook; it reportedly sells more than all other calculus
textbooks combined. A competent book designed for typical first-year college calculus courses, with the usual
advantages (lots of routine exercises and word problems) and drawbacks (not much rigor in places and not a 
lot of very challenging problems). The main version (combining single-variable and multivariable calculus, 
with late transcendentals) is 1368 pages long, and at press time retails for over $250. 


Some more advanced calculus texts:


[Sp] M. Spivak, Calculus (Third Edition), Publish or Perish, Inc., 1994. 


A very thorough, very rigorous treatment of calculus. Spivak covers many topics much more thoroughly 
than we do, and proves everything. If you want to read something more rigorous than this book, that really 
covers all the nuts-and-bolts of calculus, then you want to read Spivak’s book. His detailed construction of 
the real numbers is particularly illuminating. A new 4th edition was published in 2008. 


[Ap] T. M. Apostol, Calculus, Volume 1: One-Variable Calculus with an Introduction to Linear Algebra (2nd Edition), 
John Wiley & Sons, 1967. 


Another excellent rigorous treatment of calculus. A bit unconventional in that definite integrals are covered 
before differentiation. Also, as the title implies, has extensive coverage of beginning linear algebra, which 
leads into an introduction of vector calculus. (Volume 2 of this textbook covers multivariable calculus.) 


[MacC] C. R. MacCluer, Honors Calculus, Princeton University Press, 2006. 


A very compact (only 168 pages!) treatment of single-variable calculus. Generally assumes that students 
have seen calculus before, and fills in much of the rigor. Leaves a lot of results as “exercises” for the reader. In 
particular, the book has a very mature treatment of continuity, using topology rather than the δ-ε construction.


There are a number of essentially identical test-prep books published by various test-prep companies. These 
books are recommended for students who would like more practice with routine exercises before taking calculus 
placement tests for the purpose of potential college credit. These books may also contain test-taking tips. While 
such books are largely interchangeable, one of the more popular choices is: 


[Bar] S. O. Hockett and D. Bock, Barron’s AP Calculus, Barron’s Educational Services, Inc., 2008. 
Includes four full-length practice exams, a brief review of the topics on the exam, and test-taking tips 
(including tips on calculator usage). 


A definitive source for recent Putnam Competition problems is: 


[KPV] K. S. Kedlaya, B. Poonen, and R. Vakil, The William Lowell Putnam Competition Problems & Solutions: 1985- 
2000, Mathematical Association of America, 2002. 


All the problems from the Putnam during the period 1985-2000. Contains hints and very detailed solutions, 
including areas for further study. Warning: the majority of the problems on the Putnam involve branches of 
mathematics beyond calculus, and Putnam problems as a whole are very difficult. 


HINTS TO SELECTED PROBLEMS


1. Letting u = x° + 3x? + x, what happens if we add and subtract du to the numerator?
2. Compare the sum of the first $n$ terms to $\int_1^{n+1}$ [integrand illegible] $dx$.
3. Draw a picture first.
4. The rumor spreads when someone who knows the rumor tells it to someone who doesn’t already know it.
5. Try converting to rectangular coordinates.
6. There’s no “right answer.” Just try to explain what’s going on as best as you can. Then read the explanation in the Solutions Manual.
7. Write 5 as 2x4.
8. If $f$ is even, what is the relationship between $\int_a^b f$ and $\int_{-b}^{-a} f$?
9. Try substituting u = x.
10. Draw a picture first.
11. Let 4-dimensional space be represented by 4-tuples $(x, y, z, w)$. The 4-dimensional sphere is the graph of $x^2 + y^2 + z^2 + w^2 = 1$.
12. The velocity that the snowplow travels is inversely proportional to the amount of time since it started snowing.
13. Use integration by parts twice.
14. First prove it in the case where $f(x) > 0$ for all $x \in [a,b]$.
15. Factor out [illegible]. Now what does the sum look like?
16. An ellipse is just a “stretched” circle.
17. Try it first for $n = 2$ to get a better idea of what’s happening.
18. Try taking the log of the sequence.
19. To get the slope of the tangent line, you’ll have to take a limit. You may also need l’Hôpital’s Rule.
20. Differentiating both sides of the equation seems like a reasonable first step.
21. Group the terms by powers of 2.
22. Notice that $f(x) - f(4x) = (f(x) - f(2x)) + (f(2x) - f(4x))$.
23. The statement is false.
24. Don’t compute derivatives; start with the Taylor series for $\sin x$.
25. Substitute $u = \sqrt{x}$.
26. Use $\cos^2 x = 1 - \sin^2 x$.
27. Try to make the sum look like a Riemann sum.
28. Draw some points for easy values of $\theta$ and connect the dots.
29. What substitution makes the fractional exponents go away?
30. Note it’s about $x = 2$.
31. Slice the surface area into cylindrical strips.
32. There’s no need to use l’Hôpital’s Rule for the quotient of tangents.
33. Prove by induction, using the fact that $(fg)^{(n+1)} = ((fg)^{(n)})'$.
34. Use $u = \log x$ and $dv = x\,dx$.
35. Use the facts that if $r$ is a double-root, then $a = -2r$ and $b = r^2$.
36. One piece should be $\int$ [integrand illegible] $dx$. Evaluate this by the trig substitution $x = \tan\theta$.
37. If $\lim \frac{a_n}{b_n} = c \neq 0$, then show that $a_n < 2cb_n$ for sufficiently large $n$.
38. Some functions that work are $0$, $2x$, and $\frac{2}{c}\sin cx$ where $c$ is any nonzero constant. There is one other family of functions that works too.
39. There are two basic ways to do this problem, and they might lead you to two very different-looking answers. Check that those two answers are the same.
40. How can we easily extend the proof to arbitrary $f$? Just translate $f$ up until its range on $[a,b]$ is positive.
41. The series looks pretty close to the harmonic series.
42. You don’t necessarily need to solve for $a$ and $b$. You just need to find $f(1)$.
43. Try to construct an example where $\lim_{x\to\infty} f'(x)$ does not exist. Trig functions are usually the best examples of functions that don’t have nice asymptotic behavior as $x \to \infty$.
44. After simplification, the integral should no longer be improper.
45. Try to find a useful linear combination of the given integrals.
46. Try integration by parts with $u$ a power of $(1 - x)$ and $dv$ a power of $x$ times $dx$ [exponents illegible]. What do you get?
47. It may be easier to work in terms of coordinates rather than angles.
48. The three different behaviors are [three cases illegible].


49. Compare the series with an appropriate geometric series.
50. Divide through by the $x$ first.
51. Bring the exponent out to the front of the logarithm, and then you should have something to which you can apply l’Hôpital’s Rule.
52. Find $\delta$ so that each of $|f(x)g(x) - f(x)G|$ and $|f(x)G - FG|$ are bounded by $\frac{\epsilon}{2}$.
53. Note that you’re trying to maximize $y = r\sin\theta$ for $\theta$ in [interval illegible].
54. Show that each set is a subset of the other.
55. Drawing a picture will help a lot.
56. You’ll have to use some sort of Squeeze Theorem argument.
57. For $x \neq 0$, you will be able to compute the derivative just using the Product Formula. But at $x = 0$, you’ll have to use the limit definition of derivative.
58. Start with a partial fraction decomposition.
59. Try to group the terms of the series in a useful way.
60. It’s not as hard as it looks. Just logically work through it step-by-step.
61. Since we’re starting with $\sin 2x$, the double-angle formula may help.
62. It’s not as hard as it looks. Start with the usual idea of writing it as $\exp(\lim \log(\,\cdot\,))$ [inner expression illegible].
63. Use $|g(x) - L| \le \max\{|f(x) - L|, |h(x) - L|\}$.
64. Note that the odd-power terms of $f$ should cancel out.
65. The relevant equation is $PV = k$, where $k$ is some constant.
66. Show that an odd-degree polynomial gets arbitrarily large as $x$ gets large, and gets arbitrarily small as $x$ gets small, or vice versa.
67. Prove it by induction.
68. Use the Law of Cosines to determine the distance between the tips of the hands in terms of the angle between the hands.
69. First, show that the new parameterization gives the same curve (as a set) as the original parameterization.
70. Look for a function whose derivative is a degree 4 polynomial that is a multiple of $x^2$ (so that it has a double root at $x = 0$).
71. You can assume that Sam first rows in a straight line and then walks the rest of the way. (Why is this OK to do?)
72. Complete the square in the denominator.
73. In $(x, y, z)$-space, let one cylinder be given by $y^2 + z^2 = 1$, and let the other cylinder be given by $x^2 + z^2 = 1$.
74. The best you can do for most $n$ is to get an equation relating $\cos(n - 1)\theta$ and $\cos(n + 1)\theta$. This equation can be solved for $n = 3$, but I wouldn’t try it without using a calculator or computer.
75. How many times will this curve touch the lines $x = \pm 1$ and $y = \pm 1$ before returning back to its starting point?
76. As in part (a), you can assume that Sam first rows in a straight line and then walks the rest of the way. (Why is this OK to do?)
77. Use integration by parts.
78. Bound $\int_a^b f(x)g(x)\,dx$ above and below by appropriate multiples of $\int_a^b g(x)\,dx$.
79. Use integration by parts.
80. We know the Taylor series for $e^x$; how is the Taylor series for [function illegible] related?
81. Use a trig substitution.
82. It seems convenient to look for a function such that $\int f(x)\,dx = 0$ [limits illegible] for all $a \in \mathbb{R}$.
83. Look at a partition in which each interval is the same size.
84. Break into a sum of two series that you know how to deal with.
85. First, solve $y'' + 2y' + 2y = 0$.
86. First, try all the things in the remarks accompanying the problem.
87. Use $\sin x = \cos\left(\frac{\pi}{2} - x\right)$.
88. You could try to sketch an example first.
89. Be careful about the range of $\sin^{-1}$.
90. We know that $\lim$ [expression illegible] $= 1$. How does this help?
91. Try letting b, = log, an.
92. Do $a = 0$ and/or $b = 0$ as separate cases.
93. Drawing a picture might help to see what’s going on.
94. Try the substitution x = u°.
95. You can actually do both parts at once.
96. This is just an (admittedly complicated) application of the Chain Rule.
97. If $a < c < d < b$, show that $\int f = \int f + \int f$ [limits illegible].
98. Write the limit as a fraction with the integral in the numerator and 1 in the denominator, and use l’Hôpital’s Rule.
99. Use integration by parts twice.
100. Let $g(x)$ = [expression illegible; it involves $f(a)$, $f(b)$, and $\frac{b-x}{b-a}$]. Compute $g(c) - f(c)$ for any $c \in (a,b)$.
101. To integrate $\csc x$, look for trig expressions that have $\csc x$ in their derivatives.
102. Try a “rearranged” version of integration by parts.


103. Note that $f^{-1}(y)$ is a number but $f^{-1}(\{y\})$ is a set.
104. Use integration by parts.
105. Can the integral of a positive function ever be 0?
106. Look for a parameterization similar to part (a), but where the speed is appropriate.
107. Is something in this integral the derivative of something else in the integral?
108. Use the symmetry of the sine and cosine functions.
109. Perform long division first.
110. Substitute $v = y'$, solve for $v$, then integrate.
111. First solve $y' = y$, then worry about the “extra” $x$ term.
112. This is pretty hard to solve. You can try the substitution y = ox.
113. Start by showing $\sup(A \cap B) \le \min\{\sup A, \sup B\}$.
114. At some point you’ll want to differentiate both sides. Differentiating multiple times might help too.
115. Write the direction of the string as a function of $t$.
116. You might have to make an educated guess, and then check that your guess works.
117. Use the Mean Value Theorem.
118. Computing the tangent of what you want might be easier.
119. Drawing a picture will help a lot.
120. Try the Limit Comparison Test. What’s a good series to compare to?
121. There are two possibilities.
122. Note $f(0) = f(1) = 0$ and $f(x) > 0$ for all $x \in (0,1)$. Try to find the maximum on $[0, 1]$.
123. Try to create a “hole” in your union.
124. Write the power series for $\sin x$, then integrate term-by-term. Be a little careful about the resulting constant of integration.
125. This looks a lot like the previous problem, so the same technique should work.
126. The denominator factors, so use partial fractions.
127. You may find Pascal’s Identity useful: $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$.
128. It may be easiest to first convert to rectangular coordinates.
129. Try to get powers of 2 in the numerator.
130. What is $\int_0^1 f(x)(a - x)^2\,dx$?
131. How far around the big circle will the small circle be when the dot first retouches the big circle?
132. An example might help.
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HINTS TO SELECTED PROBLEMS 


133. The substitution u = x — 1 will make the integral look more symmetric. 
134. Use integration by parts (twice) to find an equation relating $\int \sin^n x\,dx$ and $\int \sin^{n-2} x\,dx$.
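To get a feel for the kind of relation Hint 134 is pointing at, here is a minimal numerical sketch. It assumes the definite version on $[0, \pi/2]$, where the boundary term from integration by parts vanishes and the relation reads $\int_0^{\pi/2} \sin^n x\,dx = \frac{n-1}{n} \int_0^{\pi/2} \sin^{n-2} x\,dx$ for $n \ge 2$. The interval and the helper names `simpson` and `sine_power_integral` are illustrative choices, not taken from the problem.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's Rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate weights 4 (odd index) and 2 (even index).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def sine_power_integral(n):
    """Approximate the integral of sin(x)**n over [0, pi/2]."""
    return simpson(lambda x: math.sin(x) ** n, 0.0, math.pi / 2)


# Compare both sides of the assumed relation for a few small exponents.
for n in range(2, 7):
    left = sine_power_integral(n)
    right = (n - 1) / n * sine_power_integral(n - 2)
    print(n, left, right)
```

The two columns should agree to many decimal places, which is a useful sanity check once you have derived your own relation by parts.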


135. Multiply and divide by the conjugate: $\sqrt{x^2 + x} + x$.
136. Use integration by parts. 


137. Use the function $y = \sqrt{b^2 - (x - a)^2}$ for the cross-section of the upper half of the torus.
138. The easiest way to show that g is constant is to show that g’ = 0. 

139. Just compute it. 

140. Prove by induction. 

141. Write a function for the total travel time, then minimize it. 

142. This is essentially the same argument as Problem 6.4 with some inequalities reversed. 
143. Look for a function whose derivative has two roots plus a double root. 


144. We compute 4-dimensional volume by integrating over 3-dimensional cross-sectional volumes (just as we compute 3-dimensional volume by integrating over 2-dimensional cross-sectional areas).


145. After completing the square, use a trig substitution. 
146. What is the logical choice for the common ratio of the geometric series? 


147. Let $g(x)$ be the function whose graph is the line from $(a, f(a))$ to $(b, f(b))$. Your goal is to show that $g(c) > f(c)$ for all $c \in (a,b)$.


148. It may be helpful to measure angles clockwise starting at the 12:00 position. 
149. Try to compute it recursively. 
150. Again, try to use partial fractions. 

151. Write $\tan x$ as $\frac{\sin x}{\cos x}$.


152. Look for g whose range contains both positive and negative values. 


153. Write $\frac{\sin ax}{\sin bx} = \frac{\sin ax}{x} \cdot \frac{x}{\sin bx}$.


154. If either $m$ or $n$ is odd, use $\sin^2 x + \cos^2 x = 1$ repeatedly so that we can eventually substitute $u = \sin x$ or $u = \cos x$.


155. This looks a lot like the Intermediate Value Theorem. 
156. Convert f — g to a quotient of fractions involving f and/or g. 
157. Integration by parts seems like a reasonable thing to try. 


158. You'll need to use the Triangle Inequality. 
159. For the other inequality, show that log(n + 1) — log(n) > a. 
160. Use partial fractions. 
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161. 


Try to turn $f - g$ into a $\frac{0}{0}$ indeterminate form.
162. Start with $\frac{f - g}{1}$ and divide numerator and denominator by $fg$.
163. Show that $\lim_{x \to a} f(x) = 0$ for all $a \in \mathbb{R}$.
164. It doesn’t really matter that $r$ is a root.
165. This is just a matter of chasing the definitions.
166. “The tangent lines ... are parallel” means that $f'(a) = f'(b)$.
167. Take a definite integral of areas of appropriate cross-sections.
168. If x is the distance and 6 is the angle, show that @ = tan7! = —tan=! ial
169. Factor out what you can, then multiply and divide to get rid of terms with negative exponents.
170. It can be false if $g$ is not continuous.
171. All of these parts are actually the same—why?
172. Plugging in nice values for $x$ and/or $y$ is a good strategy.
173. Try substituting $y = 0$.
174. We'd be much happier working with sin? 6 and cos? 9. What's a relationship between sin? @ and sin 9?
175. Divide the numerator and denominator by x”.
176. Use the Squeeze Theorem.
177. You'll probably need to use integration by parts 7 times.
178. The integral that you get will likely require a trig substitution to evaluate.
179. Substitute $u = \cos \theta$.
180. Substitute in the more complicated term.
181. Substitute $u = -2x$.
182. Use integration by parts twice.
183. Pick a few points and try to connect the dots.
184. Can you factor the equation that results after differentiation?
185. For a given $\epsilon$, choose a positive integer $n$ such that $\epsilon > \frac{1}{n}$. Then describe how to choose $\delta$ so that $|f(x)| < \frac{1}{n}$ for all $0 < |x - a| < \delta$.
186. Write $g(x)$ as a linear function in terms of $f(a)$ and $f(b)$.
187. Note that $f$ is bounded on a closed interval, so bound $f$ between its minimum and maximum, and use part (a).
188. Use the Series Comparison Test.
189. Note that - = : for some constant k (where s is the distance traveled by the plow).
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190. 
191. 
192. 
193. 
194. 


195. 
196. 
197. 
198. 
199.
200. 
201. 
202. 


203. 


204. 
205. 
206. 


207. 
208. 
209. 
210. 


211. 
212.
213. 
214. 


215. 
216. 
217. 


218. 
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Integrate the cross-sectional areas as a function of z. 

Start by computing ((fg)h)’ using the Product Rule. 

If both m and n are even, try the half-angle formulas. 

Experiment with some simple functions, and take a guess at the answer. This should help suggest a strategy. 


Start with the equation of a circle of radius $b$ centered at $(a, 0)$, and rotate it about the $y$-axis using the cylindrical shell method.


Use integration by parts with u = x*. 

vx =x. 

Try the substitution u = (/x. 

What substitution would get rid of all the fractional exponents? 


Multiplying by r may make it easier to convert to rectangular coordinates. 


Then, find a function of the form $y = se^{-x}$ that is a solution to the equation, for some constant $s$.
Try the Ratio Test. 
Try the substitution u = sin” x. 


What's the relationship between lim = and lim bn» 
00 Dy 


noo an 
Factor the denominator first. 
Use partial fractions to write the sum as a telescoping sum. 
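As a small illustration of the telescoping idea in the hint above, here is a sketch using a stand-in series. The series $\sum \frac{1}{n(n+1)}$ is an assumption chosen for illustration (the actual problem's series is not reproduced here); its partial fractions $\frac{1}{n} - \frac{1}{n+1}$ collapse the partial sum to $1 - \frac{1}{N+1}$. The function names are hypothetical.

```python
from fractions import Fraction


def partial_sum(N):
    """Sum 1/(n(n+1)) for n = 1..N directly, using exact rational arithmetic."""
    return sum(Fraction(1, n * (n + 1)) for n in range(1, N + 1))


def telescoped(N):
    """Closed form after writing each term as 1/n - 1/(n+1) and cancelling."""
    return 1 - Fraction(1, N + 1)


# The direct sum and the telescoped closed form agree exactly.
for N in (1, 2, 5, 10):
    print(N, partial_sum(N), telescoped(N))
```

Exact fractions are used rather than floats so that the agreement is an identity, not an approximation.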


Note the limaçon passes through the origin when $\sin \theta = -\frac{1}{2}$, and forms the boundary of the inner white region when $\sin \theta < -\frac{1}{2}$.


Use the Mean Value Theorem. 

You shouldn’t need to write out f’(x) in order to compute f’(1). 

The pictures from part (b) should give you a good idea of where the full stops might occur. 
You should get the sequence 1, —2,4, —8, 16,.... 

Show that h should satisfy h’(x) = h(x)p(x). 

Compute $\Gamma(2)$, $\Gamma(3)$, $\Gamma(4)$ using your formula from part (b). Do you see the pattern?

Let $g(x) = f(x) - f(2x)$ and $h(x) = f(x) - f(4x)$. Write an equation relating $h'(x)$, $g'(x)$, and $g'(2x)$.
Complete the square in the denominator. 

Show that if f is odd, then an antiderivative of f is even. 

What's the equation for a hyperbola? 

Before computing derivatives, do you recognize this expression as the sum of a series that you know? 


Try computing in terms of the angle at which Sam chooses to start rowing (relative to the diameter of the 
lake). 


219. The key fact is that 100 log 2 is reasonably close to 72.
220. Show that if $\sup(A \cap B) < \min\{\sup A, \sup B\}$, then there is a contradiction, by finding an element of $A \cap B$ strictly larger than $\sup(A \cap B)$.
221. Another method is to differentiate the original series.
222. Use the Ratio Test.
223. Substitute u = x’.
224. Try computing in terms of the angle at which Sam chooses to start rowing (relative to the shore of the river).
225. Let $f$ be an odd function.
226. Use induction.
227. Does $\mathbb{Q}$ contain any intervals of $\mathbb{R}$ as subsets?
sf
228. Compute lim x id = dt.
x0 i IE
229. There is only one function with range @Q—what is it?
230. You need to show that for any $c, d \in (a,b)$, that
[raadntn df sfeg
231. Write ;4> =1- 75.
232. Try drawing a right triangle.
233. Try substituting $u = \sqrt{e^x + 1}$.
234. Use a bounding argument.
235. How is our sum related to $\int_0^1 \frac{dx}{\sqrt{1 - x^2}}$?
236. Use the identity $1 + \sinh^2 x = \cosh^2 x$.
237. Use the tangent double-angle formula to compute tan (tan 4+ tan} 1),
238. Use the substitution $u = 1 - x$. How does what you end up with relate to the original integral?
239. Break up the series into a sum of two geometric series.
240. What function does this look like the Taylor series of?
241. If $P_n$ is the partition of $[a,b]$ into $n$ equal-sized pieces, what is $u(f, P_n) - l(f, P_n)$?
242. It should be straightforward to show that $a_n < b_n$ for sufficiently large $n$.
243. Given $c, d$ in the interval, you must show that any $x$ with $c < x < d$ is also in the interval.
244. $f$ is continuous on $[0,1]$, so $|f(x)|$ has an upper bound on $[0, 1]$.
245. We can multiply Taylor series.
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246. 


247. 
248. 
249. 


250. 
251.


252. 
253.
254. 
255. 
256. 
257. 
258. 
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260. 


261. 


262. 
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267. 
268. 
269. 


270. 
271.
272.


273.
274. 
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Applying the sine angle-subtraction formula at some stage may make your answer look much nicer. 
Since the function is differentiable, we must have f’(a) = 0. 
To show this curve is the same as the astroid, you’ll need the trig triple-angle formulas. 


Since we know that the Mean Value Theorem holds for differentiable functions, you should be looking for a 
function that is not differentiable. 


Does the left side of the equation look like the derivative of a product? 

If lim f’(x) does exist, what must it be? 

Think about it a bit first before diving into computations. What do you expect the answer to be? 
Use the fact that the picture is symmetric across the line y = x. 

Use what you know about the ranges of $\sin^{-1}$ and $\cos^{-1}$.

Start by drawing a picture. 

Expand the numerator and then do long division. 

For the n = 1 case, p-series are your best examples. 

Start with $\sin^2 \theta + \cos^2 \theta = 1$ and differentiate.


Substitute u = x° +1. 
Find a bound, in terms of $f'$, for $\frac{|E(z)|}{(z-a)^2}$, and then use the Mean Value Theorem to write the bound solely in terms of $f''$.


Suppose the leading term of $f(x)$ is $cx^n$ for some constant $c$ and some positive integer $n$. What are the leading terms of $f'(x)$ and $f''(x)$?


Use Rolle’s Theorem. 

Differentiate the given equation. How are f and f” related? 
Write out a few terms of $(1 - x - x^2) f(x)$ to see what's going on.
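If you would like to check your hand computation for the hint above, the following sketch multiplies out the first few terms by machine. It assumes that $f(x)$ is the power series whose coefficients are the Fibonacci numbers with $F_0 = 0$ and $F_1 = 1$; that indexing, and the helper names `fibonacci` and `multiply_truncated`, are assumptions made for illustration.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting F_0 = 0, F_1 = 1."""
    terms = [0, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


def multiply_truncated(p, q, degree):
    """Multiply two coefficient lists, keeping only terms up to x**degree."""
    product = [0] * (degree + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= degree:
                product[i + j] += a * b
    return product


DEGREE = 10
f_coeffs = fibonacci(DEGREE + 1)   # coefficients of f(x) through x**10
factor = [1, -1, -1]               # coefficients of 1 - x - x**2
product = multiply_truncated(factor, f_coeffs, DEGREE)
print(product)  # [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

Under this indexing almost every coefficient cancels, because the coefficient of $x^k$ in the product is $F_k - F_{k-1} - F_{k-2}$.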
Substitute x = u°. 

Manipulate using differences of squares. 

Recall $\sin^2 x + \cos^2 x = 1$.

When does P first again touch C? 


If we take the area inside the entire limaçon for $\theta \in [0, 2\pi]$, how many times is the small white inner loop region counted?


Write the sum as the derivative of some function. 

How can we introduce some symmetry into the integral? 
You'll need to express this area as a difference of two integrals. 
The cross-sections are squares! 


Recall that $a^x = e^{x \log a}$.
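A quick numerical check of this identity, as a minimal sketch: here $\log$ is taken to be the natural logarithm and $a > 0$ is assumed. The function name `power_via_exp` is hypothetical.

```python
import math


def power_via_exp(a, x):
    """Compute a**x as exp(x * log(a)); assumes a > 0."""
    return math.exp(x * math.log(a))


# Compare against Python's built-in power operator for a few sample values.
for a, x in [(2.0, 10.0), (9.0, 0.5), (10.0, -2.0)]:
    print(a, x, power_via_exp(a, x), a ** x)
```

Rewriting $a^x$ this way is what lets you treat it with the rules you already know for $e^u$.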


275. 
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292. 
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299. 
300. 


301. 




Write tan 0 = a 

Use part (a) with the Product Rule to compute ( f: 1)’ 

If $f \le g \le h$ and the limit of $f$ and $h$ is $L$, consider the inequality $f(x) - L \le g(x) - L \le h(x) - L$.
Consider the function g(x) = f(x) — x. 


|E(z)| 


Write an expression for 
ie 


Look first for the “obvious” solutions. 


, then use the Mean Value Theorem to write your expression solely in terms of 


Let $G(x) = \int_0^x g(t)\,dt$, and compute $\frac{d}{dx} G(f(x))$ using the Chain Rule.
Note that lim g(x) = lim g(—x) for any function g (provided either limit is defined). 
x7 x 


At some point, the substitution v = e“ will probably help. 


One method is to use partial fractions to split each term into two fractions, then write the original series as a 
sum of two series. 


The nature of the graph will depend on the sign of a — b. 
Does the integrand look like the derivative of a product? 


The next hint will tell you most of the possible functions. If you’re stuck, you might be able to reverse-engineer 
the solution once you know the answer. 


Does the value of € matter? 

Use the Limit Comparison Test with ¥, 4. You may need to use l’Hôpital’s Rule to evaluate the resulting limit.
Factor the denominator, but not all the way. 

Write $f(x) = px^3 + qx^2 + rx + s$ for some unknown coefficients $p, q, r, s$.

Note that $f'(x) = \cot x$, so $\sqrt{1 + (f'(x))^2} = \sqrt{1 + \cot^2 x}$. What trig identity can we now use?
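Once you have a candidate simplification in mind, a numerical spot check can confirm it. The sketch below compares the integrand with $\csc x$, assuming $0 < x < \pi$ so that $\sin x > 0$; the comparison target and the function names are illustrative assumptions, not part of the problem statement.

```python
import math


def arc_length_integrand(x):
    """Evaluate sqrt(1 + cot(x)**2) directly from the definition of cotangent."""
    cot = math.cos(x) / math.sin(x)
    return math.sqrt(1 + cot * cot)


def cosecant(x):
    """Evaluate csc(x) = 1 / sin(x)."""
    return 1 / math.sin(x)


# Sample a few points in (0, pi), where sin(x) is positive.
for x in (0.3, 1.0, 2.5):
    print(x, arc_length_integrand(x), cosecant(x))
```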

Look at trig functions. 

Substitute $x = 2 \tan \theta$.

Note cos? x < cosx < 1 for all x. This suggests the Squeeze Theorem. 

Go for broke—substitute u = 2 + x and see what happens. The integrand should become a polynomial 
in u. 

The factor in the denominator is also a factor of part of the numerator. 

Let $\lim f(x) = F$ and $\lim g(x) = G$. Write $f(x)g(x) - FG$ as $(f(x)g(x) - f(x)G) + (f(x)G - FG)$ and use the Triangle Inequality.

Compute the Taylor series of (1 + x)?. 


We’d like to make the substitution u = x° + 3x* + x, but where are we going to find du? 


Suppose the right intersection point of the two curves is at $x = b$. What do you know about $\int_0^b (2x - 3x^3 - c)\,dx$?
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6-e definition, 38 

$\emptyset$, 2

$\frac{0}{0}$ indeterminate form, 200-202, 204
$\frac{\infty}{\infty}$ indeterminate form, 202-203

$\gamma$, see Euler’s Constant

$\in$, 2

$\infty$, see infinity

©, 33 


30-60-90 triangle, 21 
4-dimensional sphere, 185 
45-45-90 triangle, 21 


acceleration, 106 

due to gravity, 107-108, 123 
algebraic numbers, 9 
alternating harmonic series, 239 
alternating series, 238-241 
Alternating Series Test, 239-240 
angle-addition formulas, 26, 29 
angle-subtraction formulas, 26 
annulus, 173 
antiderivative, 135, 141 

linearity, 142 

of a continuous function, 139 
antidifferentiation, 140, see also integral 
AoPS, see Art of Problem Solving 
arccos, see inverse cosine 
Archimedes spiral, 277 
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arcsin, see inverse sine 
arctan, see inverse tangent 
area, 166-169 
between curves, 166 
ellipse, 167-169 
in polar coordinates, 274-276 
area under a curve, 126-135 
as sum of rectangles, 130-131 
Darboux sum, 131 
definite integral, 131 
lower area, 131 
parabola, 128-130 
upper area, 131 
argument, 268 
Art of Problem Solving, vi, 324 
associativity, 8 
astroid, 265-267 
average value, 176-178 


Binomial Theorem, 257 

boundary point, 99 

bounded, 9 

Boundedness Theorem, 51, 55 
proof of, 55 

Boyle’s Law, 122 


$\mathbb{C}$, 35

calculator, vii, 165 

cardinality, 2 

cardioid, 275-277 

Cartesian plane, 16 

Chain Rule, 73, 157 
differentiation, 72-74 
integration, 147-153 
notation, 74 
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proof, 84-86 
characteristic polynomial, 292 
circle 
as parametric curve, 259-260 
in polar coordinates, 269 
closed interval, 11 
codomain, 13 
commutativity, 8 
completeness, 9-10 
complex numbers, 35 
composition (of functions), 14 
concave down, 92, 93 
concave up, 93 
concavity, 91-95 
conditionally converge, 240-241 
continuity, 48-52 
algebraic properties, 49 
antiderivative, 139 
Boundedness Theorem, 51, 55 
definite integral, 134 
Extreme Value Theorem, 51 
Intermediate Value Theorem, 50, 54-55 
localness of, 49 
continuous, 48 
converge 
absolute, 238-241 
conditionally, 240-241 
improper integral, 207, 210 
power series, 249-251 
sequence, 221-225 
geometric, 222 
series, 227, 229-241 
convex, 94 
coordinates 
polar, see polar coordinates 
$\cos^{-1}$, see inverse cosine
cosecant, 24 
cosh, 34, 185 
cosine, 20-24 
angle-addition formula, 26 
angle-subtraction formula, 26 
derivative of, 70 
domain and range of, 23 
double-angle formula, 27 
half-angle formula, 29 
hyperbolic, 34 
integral, 146-147 
inverse, see inverse cosine 
period of, 24, 26 
Taylor series, 251-252 
cotangent, 24 
counting numbers, 8 


critical point, 96-99 

cuboids, 169 

curve 
in polar coordinates, 269-271 
parametric, see parametric curve 

curve length, see length 

curve sketching, 90-95 

cycloid, 261-265, 267 

cylindrical shell, 172 


damped oscillation, 104 
Darboux sum, 131 
properties, 132 
decreasing, 88, 90, 223 
definite integral, 126, 131, see also integral 
approximation techniques, 179-183 
Simpson’s Rule, 181-182, 188-190 
Trapezoid Rule, 180-181 
as continuous sum, 176 
average value, 176-178 
Chain Rule, 151-152 
continuous function, 134 
Fundamental Theorem of Calculus, 135-140 
integrand, 132 
limits of integration, 132 
linearity, 142 
Mean Value Theorem, 179 
properties, 132 
Simpson’s Rule, 181-182, 188-190 
Trapezoid Rule, 180-181 
derivative, 57, 60, 61 
as rate of change, 90, 110 
Chain Rule, 72-74 
notation, 74 
proof, 84-86 
construction of, 59-60 
definition, 60 
exponential, 70-71 
first derivative, 91 
First Derivative Test, 101 
implicit, 80-82 
inverse function, 75-76 
inverse trig functions, 76 
Leibniz Rule, 84 
linearity, 65 
logarithm, 70-71 
Mean Value Theorem, 79-80 
proof, 86-87 
monomial, 65-67, 77 
Newton’s Method, 113-117 
notation, 61 
of a constant, 90 
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polynomial, 67-68 
Product Rule, 68-69 
Quotient Rule, 69, 71 
related rates, 117-122 
relationship to continuity, 62 
Rolle’s Theorem, 77-79 
second derivative, 77 
Second Derivative Test, 101 
tangent line approximation, 108-113 
trig functions, 69-70 
use in curve sketching, 90-95 
where undefined, 64 
zero, 90 

differentiable, 60 

differential equation, 279-295 
characteristic polynomial, 292 
Euler’s Method, 297-299 
exponential decay, 286 
exponential growth, 285-286 
first-order, 280 
heating, 287 
homogeneous, 290 
initial condition, 279 
integrating factor, 296 
linear, 290-295 
logistic, 287-289 
radioactive decay, 286 
relative rate (of growth), 285 
second-order, 290-294 
separable, 283 
separation of variables, 282-285 
slope field, 280-282 
uniqueness of solution, 282 

differential notation, 61 

discrete function, 221 

disjoint (sets), 6 

distributive, 8 

diverge 
improper integral, 207, 210 
sequence, 221-225 
series, 227 

Divergence Test, 230 

domain, 13 

dominate, 203-204 
little-o notation, 214 

double-angle formula, 27, 29 

dummy variable, 13, 132 


e, 31, 32 
as a limit, 204-205 
formula for, 249 
element (of a set), 1 
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ellipse, 167-169 
empty set, 2, 4 
endpoints, 259 
epicycloid, 277 
equilibrium, 287 
errata, vii 
error function, 140, 185, 214 
Euler’s Constant, 257 
Euler’s Formula, 35, 252 
Euler’s Method, 282, 297-299 
exp, see exponential 
exponential, 30, 185-188 
derivative of, 70-71 
integral, 146-147 
Taylor series, 248-249 
exponential function, 31, 32 
exponential growth, 285-286 
exponential indeterminate form, 204-206
Extended Mean Value Theorem, 215 
extrema, 96 
Extreme Value Theorem, 51 
extreme values, 96 


Fibonacci numbers 
Taylor series, 257 

Fibonacci sequence, 221 

field, 8 

finite set, 1 

first derivative, 91 

First Derivative Test, 101 

fixed point, 53 

Fourier series, 302 

frustum, 121 

function, 13-16 
antiderivative, 135 


area under graph, see area under a curve 


asymptote 
horizontal, 192 


vertical, 198-199 
average value, 176-178 
codomain, 13 
composition, 14 
concave down, 92, 93 
concave up, 93 
concavity, 91-95 
continuous, 48-52 
convex, 94 
critical points, 96-98 
decreasing, 88, 90 

strictly, 89, 90 
derivative of, 61, see also derivative 
differentiable, 60 
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discrete, 221 
domain, 13 
dominate, 199, 203-204, 214 
exponential, 29-32, 185-188 
derivative of, 70-71 
extrema, 96 
extreme values, 96 
First Derivative Test, 101 
global maximum, 99 
global minimum, 99 
graph, 16-19 
image, 15 
increasing, 88, 90 
strictly, 89, 90 
inflection point, 93 
inverse, 14 
length, 175 
local maximum, 100 
local minimum, 100 
logarithm, 32-33, 185-188 
derivative of, 70-71 
maximum, 99 
global, 99 
local, 100 
relative, 100 
minimum, 99 
global, 99 
local, 100 
relative, 100 
monotonic, 88 
strictly, 89 
nondifferentiable, 64 
odd, 140 
optimization, 96-105 
parametric, 259 
periodic, 24 
preimage, 15 
range, 13 
rational, 193-195, 236 
real-valued, 13 
relative maximum, 100 
relative minimum, 100 
Riemann zeta function, 235 
scaling, 18-19 
strictly decreasing, 89, 90 
strictly increasing, 89, 90 
strictly monotonic, 89 
translation, 18-19 
trigonometric, 19-29 
derivative of, 69-70 
Fundamental Theorem of Algebra, 52 
Fundamental Theorem of Calculus, 126, 135-140, 178 


gamma function, 214 
geometric sequence, 220-221, 225 
common ratio, 221 
convergence, 222 
geometric series, 226-229 
global maximum, 99 
global minimum, 99 
Grand Integrator, 147 
graph, 16-19 
length, 175 
gravity, 107-108, 123 
greatest integer function, 48-49 
greatest lower bound, 10 
Green’s Theorem, 267 


Hôpital, see l’Hôpital’s Rule
half-angle formula, 27, 29 
harmonic series, 230-231 
alternating, 239 
Harvard-MIT Mathematics Tournament, iv 
HMMT, see Harvard-MIT Mathematics Tournament
homogeneous, 290 
Hooke’s Law, 294 
horizontal asymptote, 192 
Horizontal Line Test, 18, 19 
hyperbolic cosine, 185 
hyperbolic trig functions, 34 
hypocycloid, 266, 267 


identity element, 8 
image, 15 
imaginary number, 35 
imaginary part, 35 
implicit differentiation, 80-82 
related rates, 117-122 
improper integral, 207-213 
comparison test, 209-210 
converge, 207, 210 
diverge, 207, 210 
improper at both ends, 211-213 
increasing, 88, 90, 223 
indefinite integral, 141-142, see also integral 
linearity, 142 
indeterminate form 
$\frac{0}{0}$, 200-202, 204
$\frac{\infty}{\infty}$, 202-203
$\infty - \infty$, 206
exponential, 204-206 
index, 220 
infimum, 10 
infinite set, 1 
infinity, 191-199 
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as a limit, 196-199 
not a number, 197 
inflection point, 93 
initial condition, 279 
integers, 7, 8
integral 
Chain Rule, 153 
definite, see definite integral 
exponential, 146-147 
improper, see improper integral 
indefinite, 141-142 
integrand, 132 
integration by parts, 153-157 
limits of integration, 132 
linearity, 142 
partial fractions, 161-162 
polynomials, 142-146 
substitution, 157-161 
trig substitution, 157-160 
trigonometric, 146-147, 152-153, 165 
integral sign, 132 
Integral Test, 233-234, 237 
integrand, 132 
integrating factor, 296 
integration, see integral 
Product Rule, 153 
integration bee, see MIT Integration Bee 
integration by parts, 153-157 
Intermediate Value Theorem, 50, 54 
proof of, 54-55 
intersection, 5 
interval, 10-12 
closed, 11 
half-open, 11 
open, 11 
partition of, 130 
inverse, 8, 14 
inverse cosine, 25 
inverse function 
derivative of, 75-76 
Inverse Function Rule, 75-76 
inverse sine, 25 
derivative of, 76 
integral, 159, 166 
inverse tangent, 25 
derivative of, 76 
integral, 159 
Taylor series, 255 


l’Hôpital’s Rule, 200-206
$\frac{0}{0}$ indeterminate forms, 200-202, 204
$\frac{\infty}{\infty}$ indeterminate forms, 202-203
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exponential indeterminate forms, 204-206 
proof, 215-219 
least upper bound, 10 
left Riemann sum, 135 
Leibniz Rule, 84 
Leibniz, Gottfried, iii 
length, 173-176, 265 
of parametric curve, 265, 278 
limaçon, 273, 276
limit, 37-48 
$\delta$-$\epsilon$ definition of, 38
at infinity, 191-196 
as finite limit, 195-196 
comparison of, 43 
infinite, 196-199 
one-sided, 45-47 
properties, 42-43 
rational function, 193-195 
sequence, 222 
Squeeze Theorem, 44, 48 
uniqueness of, 41-42 
Limit Comparison Test, 235-237 
limits of integration, 132 
line 
as parametric curve, 260-261 
in polar coordinates, 273, 277 
linear, 65, 142 
linear approximation, 111 
links, vii, 303 
Lissajous curve, 267 
little-o notation, 214 
ln, 32, see also logarithm
local, 49 
local linearization, 110 
local maximum, 100, 101 
local minimum, 100, 101 
log, 32, see also logarithm 
logarithm, 32, 185-188 
derivative of, 70-71 
integral, 156 
Taylor series, 252-254 
logistic equation, 287-289 
lower area, 131 
lower bound, 9 
lower Darboux integral, 131 
lower Darboux sum, 131 


Maclaurin polynomial, 245 
Maclaurin series, 247 
magnitude, 268 

maximal element, 11 
maximum, 11, 99 
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Mean Value Theorem, 79-80, 86, 179 
extended, 215 
proof, 86-87 
member (of a set), 1 
midpoint Riemann sum, 135 
minimal element, 11 
minimum, 11, 99 
MIT Integration Bee, 147, 184 
monotonic, 88, 223 


$\mathbb{N}$, 8
natural logarithm, 32, 185 
natural numbers, 8 
Newton's Law of Heating, 287 
Newton’s Method, 113-117 
Newton, Isaac, iii 
normal distribution, 301 
numbers 
algebraic, 9 
integers, 8 
positive, 7 
natural, 8 
rational, 8 
real, 9 
completeness of, 9-10 
construction of, 9-10 
transcendental, 9 


one-sided limits, 45-47 
open interval, 11 
operator notation, 61 
optimization, 96-105 


p-series, 230, 232-235 
parameterization, 259 
parametric curve, 259-266 
astroid, 265-267 
circle, 259-260 
cycloid, 261-265, 267 
hypocycloid, 267 
length, 265, 278 
line, 260-261 
polar curve, 271 
speed, 264-265 
tangent line, 262-264 
parametric function, 259 
partial fractions, 161-162 
partial sum, 226, 227 
partition, 130 
period, 24 
periodic, 24 
polar coordinates, 267-276 


Archimedes spiral, 277 
area, 274-276 
argument, 268 
cardioid, 275-277 
circle, 269 
conversion to/from rectangular, 268 
curves, 269-271 
epicycloid, 277 
limaçon, 273, 276
line, 273, 277 
magnitude, 268 
rose, 269-270, 278 
area, 275 
tangent line, 271-273 
positive integers, 7 
power series, 247-255 
convergence, 249-251 
radius of convergence, 249-251 
preimage, 15 
probability density function, 301 
Product Rule, 69 
differentiation, 68-69 
integration, 153 
proof by contradiction, 9 
proper subset, 3 
Putnam, see William Lowell Putnam Mathematical Competition
pyramid, 169-170 


$\mathbb{Q}$, 8

Q9 

quadratic approximation, 241-243 
Quotient Rule, 69, 71 


$\mathbb{R}$, 9
Racetrack Theorem, 123 
radians, 20 
radius of convergence, 249-251 
range, 13 
rate of change, 90, 110 
Ratio Test, 236-237 
Taylor series, 250-251 
rational function, 193-195, 198-199 
series, 236 
rational numbers, 8 
real numbers, 9 
construction of, 9-10 
real part, 35 
real-valued, 13 
rectangular coordinates, 268 
recursive, 221 
recursive formula, 221 
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related rates, 117-122 
relative maximum, 100 
relative minimum, 100 
resources, vi
Riemann Hypothesis, 235 
Riemann sum, 134-135 
Riemann zeta function, 235 
right Riemann sum, 135 
Rolle’s Theorem, 77-79 
Root Test, 238 
rose, 269-270, 278 

area, 275 
Rule of Seventy-two, 124 
Russell’s paradox, 34 


secant, 24 
secant line, 57, 58 
second derivative, 77, 91
Second Derivative Test, 101 
second-order, 243 
separable, 283 
separation of variables, 282-285 
sequence, 220-225 
bounded, 222-225 
converge, 221-225 
decreasing, 222-224 
diverge, 221-225 
Fibonacci, 221 
geometric, see geometric sequence 
increasing, 222-224 
index, 220 
limit, 222 
monotonic, 222-224
recursive definition, 221 
series, 226-255 
absolute convergence, 238-241 
alternating, 238-241 
alternating harmonic, 239 
Alternating Series Test, 239-240 
conditionally converge, 240-241 
converge, 227, 229-241 
diverge, 227 
Divergence Test, 230 
geometric, 226-229 
harmonic, 230-231 
Integral Test, 233-234, 237 
Limit Comparison Test, 235-237 
Maclaurin, 247, see also Taylor series 
p-series, 230, 232-235 
partial sum, 226, 227 
power, see power series 
Ratio Test, 236-237 
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rational functions, 236 
Root Test, 238 
Series Comparison Test, 231-232 
sum, 226 
Taylor, see Taylor series 
telescoping, 229 
Series Comparison Test, 231-232 
set, 1-7 
cardinality, 2 
difference, 6, 7 
disjoint, 6 
distributive law, 6 
element, 1 
empty, 2, 4 
finite, 1 
infinite, 1 
intersection, 5 
member, 1 
Russell’s paradox, 34 
subset, 2 
proper, 3 
superset, 3 
symmetric difference, 33 
union, 4 
set difference, 6, 7 
sign graph, 90 
Simpson’s Rule, 181-182, 188-190
$\sin^{-1}$, see inverse sine
sine, 20-24 
angle-addition formula, 26 
angle-subtraction formula, 26 
derivative of, 69-70 
domain and range of, 23 
double-angle formula, 27 
half-angle formula, 29 
hyperbolic, 34 
integral, 146-147 
inverse, see inverse sine 
period of, 24, 26 
Taylor series, 251-252 
sinh, 34 
slicing, 170-171 
slope field, 280-282 
Snell’s Law, 124 
solid of revolution, 171-173 
sphere, 171 
4-dimensional, 185 
square wave, 302 
Squeeze Theorem, 44, 48 
strictly decreasing, 30, 89, 90, 223 
strictly increasing, 89, 90, 223 
strictly monotonic, 89 


INDEX 


subset, 2
proper, 3
warning about notation, 3
substitution method, 157-161
sum, 226, 227
superset, 3
supremum, 10, 223
symmetric difference, 33


$\tan^{-1}$, see inverse tangent
tangent, 20
angle-addition formula, 29
derivative of, 70
domain and range of, 23
double-angle formula, 29
half-angle formula, 29
inverse, see inverse tangent
period of, 26
tangent line, 57-60, 263
approximation using, 108-113
definition, 60
in polar coordinates, 271-273
to parabola, 57-58
to parametric curve, 262-264
tangent line approximation, 108-113, 241
error, 110-111, 113, 124
Newton’s Method, 113-117
Taylor polynomial, 241-247
error, 245-247
Taylor series, 247-255
differentiation, 254-255
exponential, 248-249
logarithm, 252-254
radius of convergence, 249-251
Ratio Test, 250-251
trigonometric, 251-252, 255
telescoping, 136, 229
torus, 178
transcendental number, 9
Trapezoid Rule, 180-181
triangle
30-60-90, 21
45-45-90, 21
trig substitution, 157-160


union, 4
unit circle, 22
upper area, 131
upper bound, 9
upper Darboux integral, 131
upper Darboux sum, 131


velocity, 106
along parametric curve, 264-265
vertical asymptote, 198-199
Vertical Line Test, 17, 19
volume, 169-173
4-dimensional sphere, 185
by slicing, 170-171
cylindrical shell method, 172
pyramid, 169-170
solid of revolution, 171-173
sphere, 171
torus, 178


whole numbers, 8
William Lowell Putnam Mathematical Competition, iv
Wolfram|Alpha, vii


$\mathbb{Z}$, 8
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